use_as_default was not being properly applied into model overrides.
For compartmentalization's sake, apply all overrides in a single function
to avoid clutter.
In addition, fix where the traditional /v1/model/load endpoint checks
for draft options. These can be applied via an inline config, so let
any failures fallthrough.
Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
The model card is a unified structure for sharing model params.
Rather than kwargs, use this instead.
Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
If an API key sends a dummy model, it shouldn't error as the server
is catering to clients that expect specific OAI model names. This
is a problem with inline model loading since these names would error
by default. Therefore, add an exception if the provided name is in the
dummy model names (which also doubles as inline strict exceptions).
However, the dummy model names weren't configurable, so add a new
option to specify exception names, otherwise the default is gpt-3.5-turbo.
Signed-off-by: kingbri <bdashore3@proton.me>
* Model: Fix inline loading and draft key
There was a lack of foresight between the new config.yml and how
it was structured. The "draft" key became "draft_model" without updating
both the API request and inline loading keys.
For the API requests, still support "draft" as legacy, but the "draft_model"
key is preferred.
Signed-off-by: kingbri <bdashore3@proton.me>
* OAI: Add draft model dir to inline load
Was not pushed before and caused errors of the kwargs being None.
Signed-off-by: kingbri <bdashore3@proton.me>
* Model: Fix draft args application
Draft model args weren't applying since there was a reset due to how
the old override behavior worked.
Signed-off-by: kingbri <bdashore3@proton.me>
* OAI: Change embedding model load params
Use embedding_model_name to be inline with the config.
Signed-off-by: kingbri <bdashore3@proton.me>
* API: Fix parameter for draft model load
Alias name to draft_model_name.
Signed-off-by: kingbri <bdashore3@proton.me>
* API: Fix parameter for template switch
Add prompt_template_name to be more descriptive.
Signed-off-by: kingbri <bdashore3@proton.me>
* API: Fix parameter for model load
Alias name to model_name for config parity.
Signed-off-by: kingbri <bdashore3@proton.me>
* API: Add alias documentation
Signed-off-by: kingbri <bdashore3@proton.me>
---------
Signed-off-by: kingbri <bdashore3@proton.me>
It's best to pass them down the config stack.
API/User config.yml -> model config.yml -> model config.json -> fallback.
Doing this allows for seamless flow and yielding control to each
member in the stack.
Signed-off-by: kingbri <bdashore3@proton.me>
Storing a pathlib type makes it easier to manipulate the model
directory path in the long run without constantly fetching it
from the config.
Signed-off-by: kingbri <bdashore3@proton.me>
Embedding models are managed on a separate backend, but are run
in parallel with the model itself. Therefore, manage this in a separate
container with separate routes.
Signed-off-by: kingbri <bdashore3@proton.me>
Place OAI specific routes in the appropriate folder. This is in
preperation for adding new API servers that can be optionally enabled.
Signed-off-by: kingbri <bdashore3@proton.me>