Model: Fix inline loading and draft key (#225)

* Model: Fix inline loading and draft key

The new config.yml structure renamed the "draft" key to "draft_model",
but the API request and inline loading keys were not updated to match.

API requests still support "draft" as a legacy key, but "draft_model"
is preferred.

Signed-off-by: kingbri <bdashore3@proton.me>

* OAI: Add draft model dir to inline load

This was not pushed previously, which caused errors because the kwargs were None.

Signed-off-by: kingbri <bdashore3@proton.me>

* Model: Fix draft args application

Draft model args weren't being applied because the old override
behavior reset them.

Signed-off-by: kingbri <bdashore3@proton.me>

* OAI: Change embedding model load params

Use embedding_model_name to be in line with the config.

Signed-off-by: kingbri <bdashore3@proton.me>

* API: Fix parameter for draft model load

Alias name to draft_model_name.

Signed-off-by: kingbri <bdashore3@proton.me>

* API: Fix parameter for template switch

Add prompt_template_name to be more descriptive.

Signed-off-by: kingbri <bdashore3@proton.me>

* API: Fix parameter for model load

Alias name to model_name for config parity.

Signed-off-by: kingbri <bdashore3@proton.me>

* API: Add alias documentation

Signed-off-by: kingbri <bdashore3@proton.me>

---------

Signed-off-by: kingbri <bdashore3@proton.me>
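
The legacy-key handling described above ("draft" still accepted, "draft_model" preferred) can be sketched as a small key-normalization step. This is only an illustration under stated assumptions, not the project's actual implementation: the `LEGACY_KEYS` table and the `normalize_keys` helper are hypothetical names, and the sketch only handles top-level keys of a request payload.

```python
from typing import Any, Dict

# Hypothetical table (an assumption for this sketch): legacy key -> preferred key.
LEGACY_KEYS: Dict[str, str] = {
    "draft": "draft_model",
}


def normalize_keys(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of payload with legacy keys renamed to preferred ones.

    If both the legacy and the preferred key are present, the preferred
    key's value is kept and the legacy value is dropped.
    """
    result = dict(payload)  # shallow copy so the caller's dict is untouched
    for legacy, preferred in LEGACY_KEYS.items():
        if legacy in result:
            legacy_value = result.pop(legacy)
            # setdefault only writes when the preferred key is absent
            result.setdefault(preferred, legacy_value)
    return result
```

With this shape, a request that still sends "draft" ends up with the same keyword the loader receives as `draft_model=` at the call site, and a request that sends both keys resolves in favor of "draft_model".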

Brian Dashore

2024-10-24 23:35:05 -04:00

• committed by

GitHub

parent f20857cb34

commit 6e48bb420a

No known key found for this signature in database

GPG key ID: B5690EEEBB952194

7 changed files with 68 additions and 46 deletions

									
main.py

@@ -70,7 +70,7 @@ async def entrypoint_async():
         await model.load_model(
             model_path.resolve(),
             **config.model.model_dump(exclude_none=True),
-            draft=config.draft_model.model_dump(exclude_none=True),
+            draft_model=config.draft_model.model_dump(exclude_none=True),
         )

         # Load loras after loading the model
