Embeddings: Switch to Infinity

Infinity-emb is an async batching engine for embeddings. It is
preferable to sentence-transformers because it handles scalable use
cases without requiring the caller to manage threads itself.

Signed-off-by: kingbri <bdashore3@proton.me>
kingbri 2024-07-29 13:42:03 -04:00
parent c9a5d2c363
commit 3f21d9ef96
4 changed files with 87 additions and 100 deletions
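
The "async batching" mentioned in the commit message can be pictured with the standard-library sketch below. It is only an illustration of the general technique, not infinity-emb's actual implementation: `BatchingEmbedder`, `fake_embed_batch`, and `max_batch_size` are hypothetical names, and the stand-in model simply returns each text's length.

```python
import asyncio
from typing import Callable, List, Optional, Sequence

Vector = List[float]


class BatchingEmbedder:
    """Toy async batcher (hypothetical; not infinity-emb's code)."""

    def __init__(
        self,
        embed_batch: Callable[[Sequence[str]], List[Vector]],
        max_batch_size: int = 8,
    ) -> None:
        self._embed_batch = embed_batch
        self._max_batch_size = max_batch_size
        self._queue: Optional[asyncio.Queue] = None
        self._worker: Optional[asyncio.Task] = None

    async def start(self) -> None:
        # Create the queue inside the running event loop.
        self._queue = asyncio.Queue()
        self._worker = asyncio.create_task(self._run())

    async def stop(self) -> None:
        if self._worker is not None:
            self._worker.cancel()
            try:
                await self._worker
            except asyncio.CancelledError:
                pass
            self._worker = None

    async def embed(self, text: str) -> Vector:
        # Each caller enqueues its request and waits on its own future.
        assert self._queue is not None, "call start() first"
        future = asyncio.get_running_loop().create_future()
        await self._queue.put((text, future))
        return await future

    async def _run(self) -> None:
        assert self._queue is not None
        while True:
            # Wait for the first request, then drain whatever else is
            # already queued, up to the batch limit.
            batch = [await self._queue.get()]
            while len(batch) < self._max_batch_size and not self._queue.empty():
                batch.append(self._queue.get_nowait())
            texts = [text for text, _ in batch]
            try:
                # A real engine would run this off the event loop;
                # the toy model here is cheap, so it is called inline.
                vectors = self._embed_batch(texts)
            except Exception as exc:
                for _, future in batch:
                    if not future.done():
                        future.set_exception(exc)
                continue
            for (_, future), vector in zip(batch, vectors):
                if not future.done():
                    future.set_result(vector)


def fake_embed_batch(texts: Sequence[str]) -> List[Vector]:
    # Stand-in for a model: one-dimensional "embedding" = text length.
    return [[float(len(text))] for text in texts]


async def main() -> List[Vector]:
    embedder = BatchingEmbedder(fake_embed_batch, max_batch_size=4)
    await embedder.start()
    try:
        # Concurrent callers need no threads; the worker batches them.
        return await asyncio.gather(
            *(embedder.embed(text) for text in ["a", "bb", "ccc"])
        )
    finally:
        await embedder.stop()


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Callers only `await embed(...)`; grouping requests into batches happens in a single background task, which is the property the commit message contrasts with managing threads around sentence-transformers.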

@@ -72,6 +72,13 @@ developer:
# Otherwise, the priority will be set to high
#realtime_process_priority: False
embeddings:
  embeddings_model_dir: models
  embeddings_model_name:
  embeddings_device: cpu
# Options for model overrides and loading
# Please read the comments to understand how arguments are handled between initial and API loads
model: