tabbyAPI-ollama/docs/04.-Chat-Completions.md
kingbri 7900b72848 API: Add chat_template_kwargs alias for template_vars
This key is used in VLLM and SGLang.

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-05-12 15:48:39 -04:00

34 lines
No EOL
2 KiB
Markdown

## Chat Completions
TabbyAPI builds on top of the HuggingFace "chat templates" standard for OAI style chat completions (`/v1/chat/completions`).
If you'd like more detail, look at the [autogenerated documentation](https://theroyallab.github.io/tabbyAPI/#operation/chat_completion_request_v1_chat_completions_post).
### Custom Templates
By default, TabbyAPI will try to pull the chat template from a model's `chat_template` key within a model's `tokenizer_config.json`, but you can also make a custom jinja file. To learn how to create a HuggingFace compatible jinja2 template, Please read [Huggingface's documentation](https://huggingface.co/docs/transformers/main/chat_templating).
If you create a custom template for a model, consider PRing it to the [templates repository](https://github.com/theroyallab/llm-prompt-templates)
In addition, there's also support to specify stopping strings within the chat template. This can be achieved by adding `{%- set stop_strings = ["string1"] -%}` at the top of the jinja file. In this case, `string1` will be appended to your completion as a stopping string.
> [!WARNING]
> Make sure to add `{%- -%}` for any top-level metadata. If this is not provided, the top of the rendered prompt will have extra whitespace. This does not apply for comments `{# #}`
To use a custom template, place it in the templates folder, and make sure to set the `prompt_template` field in `config.yml` (see [model config](https://github.com/theroyallab/tabbyAPI/wiki/2.-Server-options#model-options)) to the template's filename.
### Template Variables
A chat completions request to TabbyAPI also supports custom template variables in the form of a key/value object in the JSON body. Here's an example:
```json
"template_vars": {
"test_var": "hello!"
}
```
> [!NOTE]
> To preserve compatibility with other standards, `chat_template_kwargs` can be used instead of `template_vars`
Now let's pass the custom var in the following template:
```jinja2
I'm going to say {{ test_var }}
```
Running render on this template will now result in: `I'm going to say hello!`