Model: Fix inline loading and draft key (#225)

* Model: Fix inline loading and draft key

The new config.yml structure was introduced without accounting for
everywhere its keys are used. The "draft" key became "draft_model", but
neither the API request keys nor the inline loading keys were updated.

API requests still accept "draft" as a legacy key, but "draft_model"
is preferred.
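
A minimal sketch of the legacy-key fallback described above, assuming the request body arrives as a plain dict. The helper name resolve_draft_config is hypothetical and is not taken from the codebase; only the "draft", "draft_model", and "draft_model_name" keys come from this commit message.

```python
from typing import Any, Dict, Optional


def resolve_draft_config(request: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return draft settings, preferring "draft_model" over legacy "draft".

    Hypothetical helper for illustration only.
    """
    # The new key wins whenever it is present and not None.
    if request.get("draft_model") is not None:
        return request["draft_model"]
    # Otherwise fall back to the legacy key (None if it is absent too).
    return request.get("draft")


legacy_request = {"draft": {"draft_model_name": "example-draft"}}
mixed_request = {
    "draft": {"draft_model_name": "old-draft"},
    "draft_model": {"draft_model_name": "new-draft"},
}

legacy_result = resolve_draft_config(legacy_request)
mixed_result = resolve_draft_config(mixed_request)
```

When both keys are supplied, this sketch silently prefers "draft_model"; a real implementation might instead reject the ambiguous request.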

Signed-off-by: kingbri <bdashore3@proton.me>

* OAI: Add draft model dir to inline load

This was not pushed previously, which caused errors because the kwargs were None.

Signed-off-by: kingbri <bdashore3@proton.me>

* Model: Fix draft args application

Draft model args were not being applied because the old override
behavior reset them.

Signed-off-by: kingbri <bdashore3@proton.me>

* OAI: Change embedding model load params

Use embedding_model_name to be in line with the config.

Signed-off-by: kingbri <bdashore3@proton.me>

* API: Fix parameter for draft model load

Alias name to draft_model_name.

Signed-off-by: kingbri <bdashore3@proton.me>

* API: Fix parameter for template switch

Add prompt_template_name to be more descriptive.

Signed-off-by: kingbri <bdashore3@proton.me>

* API: Fix parameter for model load

Alias name to model_name for config parity.

Signed-off-by: kingbri <bdashore3@proton.me>

* API: Add alias documentation

Signed-off-by: kingbri <bdashore3@proton.me>

---------

Signed-off-by: kingbri <bdashore3@proton.me>
Brian Dashore 2024-10-24 23:35:05 -04:00 committed by GitHub
parent f20857cb34
commit 6e48bb420a
7 changed files with 68 additions and 46 deletions


@@ -70,7 +70,7 @@ async def entrypoint_async():
         await model.load_model(
             model_path.resolve(),
             **config.model.model_dump(exclude_none=True),
-            draft=config.draft_model.model_dump(exclude_none=True),
+            draft_model=config.draft_model.model_dump(exclude_none=True),
         )
         # Load loras after loading the model
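
The hunk above renames one keyword argument. The following sketch shows one way a stale keyword can fail silently, assuming the loader reads its draft settings out of **kwargs by key. This load_model is a hypothetical stand-in, not the project's actual function, and the model path and draft name are invented for the example.

```python
from typing import Any, Dict


def load_model(model_path: str, **kwargs: Any) -> Dict[str, Any]:
    """Hypothetical stand-in for a loader that looks up "draft_model" in kwargs."""
    # A caller passing draft=... is accepted without error, but the
    # settings are never read because only "draft_model" is looked up.
    draft_args = kwargs.get("draft_model") or {}
    return {"model_path": model_path, "draft_args": draft_args}


draft_config = {"draft_model_name": "example-draft"}

# Old call site: the draft settings are dropped without any error.
old_call = load_model("models/example", draft=draft_config)

# Updated call site: the draft settings reach the loader.
new_call = load_model("models/example", draft_model=draft_config)
```

Because **kwargs accepts any keyword, a mismatch like this raises no exception; it only shows up as draft settings that never take effect.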