Chat Completions
TabbyAPI builds on top of the HuggingFace "chat templates" standard for OAI-style chat completions (/v1/chat/completions).
If you'd like more detail, look at the autogenerated documentation.
Custom Templates
By default, TabbyAPI will try to pull the chat template from the chat_template key in a model's tokenizer_config.json, but you can also provide a custom Jinja2 template file. To learn how to create a HuggingFace-compatible Jinja2 template, please read HuggingFace's documentation.
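For reference, the built-in template lives under the chat_template key of tokenizer_config.json. The snippet below is a shortened, hypothetical excerpt; real templates are usually much longer:

```json
{
  "chat_template": "{%- for message in messages -%}{{ message['role'] }}: {{ message['content'] }}\n{%- endfor -%}"
}
```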
If you create a custom template for a model, consider opening a PR to the templates repository.
In addition, you can specify stopping strings within the chat template by adding {%- set stop_strings = ["string1"] -%} at the top of the Jinja file. In this case, string1 will be used as a stopping string for your completion.
Warning
Make sure to use whitespace-control markers ({%- ... -%}) for any top-level metadata. If these are omitted, the top of the rendered prompt will contain extra whitespace. This does not apply to comments ({# #}).
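Putting the two points above together, a template that sets stop_strings at the top with whitespace control might look like the following minimal sketch (the message loop here is a hypothetical example, not a template for any specific model):

```jinja
{# Comments do not need whitespace control #}
{%- set stop_strings = ["string1"] -%}
{%- for message in messages -%}
{{ message['role'] }}: {{ message['content'] }}
{% endfor -%}
```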
To use a custom template, place it in the templates folder, and make sure to set the prompt_template field in config.yml (see model config) to the template's filename.
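As a sketch, assuming a custom template saved at templates/chatml.jinja (a hypothetical filename), the relevant part of config.yml might look like this; check the model config documentation for the exact section layout and whether the file extension is included:

```yaml
model:
  # Hypothetical example referencing templates/chatml.jinja
  prompt_template: chatml
```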
Template Variables
A chat completions request to TabbyAPI also supports custom template variables in the form of a key/value object in the JSON body. Here's an example:
"template_vars": {
"test_var": "hello!"
}
Now let's pass the custom var in the following template:
```jinja
I'm going to say {{ test_var }}
```
Rendering this template will now produce: I'm going to say hello!
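For context, a complete chat completions request body carrying template_vars might look like the following sketch (the model name and message content are hypothetical placeholders):

```json
{
  "model": "my-model",
  "messages": [
    { "role": "user", "content": "Say the word." }
  ],
  "template_vars": {
    "test_var": "hello!"
  }
}
```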