Model: Fix inline loading and draft key (#225)
* Model: Fix inline loading and draft key

There was a lack of foresight between the new config.yml and how it
was structured. The "draft" key became "draft_model" without updating
either the API request or the inline loading keys.

For API requests, "draft" is still supported as a legacy key, but
"draft_model" is preferred.

Signed-off-by: kingbri <bdashore3@proton.me>

* OAI: Add draft model dir to inline load

This was not pushed before and caused errors because the kwargs were
None.

Signed-off-by: kingbri <bdashore3@proton.me>

* Model: Fix draft args application

Draft model args weren't being applied because they were reset, a side
effect of how the old override behavior worked.

Signed-off-by: kingbri <bdashore3@proton.me>

* OAI: Change embedding model load params

Use embedding_model_name to be in line with the config.

Signed-off-by: kingbri <bdashore3@proton.me>

* API: Fix parameter for draft model load

Alias name to draft_model_name.

Signed-off-by: kingbri <bdashore3@proton.me>

* API: Fix parameter for template switch

Add prompt_template_name to be more descriptive.

Signed-off-by: kingbri <bdashore3@proton.me>

* API: Fix parameter for model load

Alias name to model_name for config parity.

Signed-off-by: kingbri <bdashore3@proton.me>

* API: Add alias documentation

Signed-off-by: kingbri <bdashore3@proton.me>

---------

Signed-off-by: kingbri <bdashore3@proton.me>
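The aliasing described in the message (legacy "draft" accepted alongside the preferred "draft_model", and "name" aliased to "model_name" and "draft_model_name") can be illustrated with a minimal, standard-library sketch. This is not the project's implementation: the alias tables and the `apply_aliases` and `normalize_load_request` helpers are hypothetical names chosen for illustration, and the assumption that the preferred key wins when both spellings are sent is mine, not something the commit states.

```python
from typing import Any, Dict

# Hypothetical alias tables (legacy key -> preferred key), based on the
# renames named in the commit message.
TOP_LEVEL_ALIASES: Dict[str, str] = {"name": "model_name", "draft": "draft_model"}
DRAFT_ALIASES: Dict[str, str] = {"name": "draft_model_name"}


def apply_aliases(payload: Dict[str, Any], aliases: Dict[str, str]) -> Dict[str, Any]:
    """Return a copy of payload with legacy keys renamed to preferred keys.

    Assumption: if both spellings are present, the preferred key wins.
    """
    # Start with every key that is not a legacy spelling.
    result = {key: value for key, value in payload.items() if key not in aliases}
    # Fill in preferred keys from legacy ones only where they are missing.
    for legacy, preferred in aliases.items():
        if legacy in payload and preferred not in result:
            result[preferred] = payload[legacy]
    return result


def normalize_load_request(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Normalize a load request body, including its nested draft section."""
    normalized = apply_aliases(payload, TOP_LEVEL_ALIASES)
    draft = normalized.get("draft_model")
    if isinstance(draft, dict):
        normalized["draft_model"] = apply_aliases(draft, DRAFT_ALIASES)
    return normalized


legacy_request = {"name": "main-model", "draft": {"name": "small-draft"}}
normalized_request = normalize_load_request(legacy_request)
```

With this sketch, a legacy body and a body that already uses the preferred keys normalize to the same dictionary, so downstream code only has to look up "draft_model".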
parent f20857cb34
commit 6e48bb420a
7 changed files with 68 additions and 46 deletions
main.py | 2 +-
@@ -70,7 +70,7 @@ async def entrypoint_async():
         await model.load_model(
             model_path.resolve(),
             **config.model.model_dump(exclude_none=True),
-            draft=config.draft_model.model_dump(exclude_none=True),
+            draft_model=config.draft_model.model_dump(exclude_none=True),
         )

         # Load loras after loading the model
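Why a one-word keyword rename matters can be shown with a small sketch. It assumes, hypothetically, that the loader accepts arbitrary keyword arguments and looks up its draft settings under the "draft_model" key; the `load_model` stand-in below is not the project's function, and the model names and path are placeholders.

```python
from typing import Any, Dict, Optional


def load_model(model_path: str, **kwargs: Any) -> Dict[str, Any]:
    """Hypothetical stand-in loader that reads draft settings from kwargs."""
    # Assumption: the loader only looks for the new "draft_model" key.
    draft_args: Optional[Dict[str, Any]] = kwargs.get("draft_model")
    return {
        "model_path": model_path,
        "draft_enabled": draft_args is not None,
        "draft_args": draft_args or {},
    }


draft_config = {"draft_model_name": "small-draft"}

# Old call site: the settings arrive under a key the loader never reads,
# so they are dropped without raising an error.
old_call = load_model("models/main", draft=draft_config)

# New call site: the keyword matches the key the loader looks up.
new_call = load_model("models/main", draft_model=draft_config)
```

Because `**kwargs` accepts any keyword, a stale key name does not fail at the call site; under these assumptions it only shows up later as a missing draft configuration.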