Matching YALS, if the model has add_bos_token enabled, then remove
an extra BOS token at the start of the prompt. This usually happens
with misconfigured templates such as Llama 3.
Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
Tools must be None by default. Chat completion message content can
be None, a string, or a list, so default to None. Exclude all None
values from a CC message since the template can say the variable
"exists" despite being None, causing an error.
Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
Like YALS, logging all pertinent information after model load makes
it easier to parse by the user.
Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
Messages were mistakenly being sent as Pydantic objects, but templates
expect dictionaries. Properly convert these before render.
In addition, initialize all Optional lists as an empty list since
this will cause the least problems when interacting with other parts
of API code, such as templates.
Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
Some packages such as ExllamaV2 and V3 require specific versions for
the latest features. Rather than creating repetitive functions, create
an agnostic function to check the installed package and then report
to the user to upgrade.
This is also sent to requests for loading and unloading, so keep the
error short.
Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
The HFModel class serves to coalesce all config files that contain
random keys which are required for model usage.
Adding this base class allows us to expand as HuggingFace randomly
changes their JSON schemas over time, reducing the brunt that backend
devs need to feel when their next model isn't supported.
Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
This parameter is way too confusing and does not make sense in
the modern LLM space.
Change approved by all maintainers.
Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
If an inference dep isn't present, force exit the application. This
occurs after all subcommands have been appropriately processed.
Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
Seemed out of place in the common load function. In addition, rename
the transformers utils signature which actually takes a directory
instead of a file.
Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
Adding a comma in the description converts the string to a tuple,
which isn't parseable by argparse's help.
Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>