Chat Completions
TabbyAPI builds on top of the HuggingFace "chat templates" standard for OAI-style chat completions (/v1/chat/completions).
If you'd like more detail, look at the autogenerated documentation.
Custom Templates
By default, TabbyAPI will try to pull the chat template from the chat_template key in a model's tokenizer_config.json, but you can also provide a custom Jinja2 template file. To learn how to create a HuggingFace-compatible Jinja2 template, please read HuggingFace's documentation.
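For reference, the built-in template lives under the chat_template key of tokenizer_config.json. The snippet below is a shortened, hypothetical excerpt; real templates are usually much longer:

```json
{
  "chat_template": "{%- for message in messages -%}{{ message['role'] }}: {{ message['content'] }}\n{%- endfor -%}"
}
```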
If you create a custom template for a model, consider opening a PR to the templates repository.
In addition, you can specify stopping strings within the chat template by adding {%- set stop_strings = ["string1"] -%} at the top of the Jinja file. In this case, string1 will be used as a stopping string for your completion.
Warning
Make sure to use whitespace-control markers ({%- ... -%}) for any top-level metadata. If these are omitted, the top of the rendered prompt will contain extra whitespace. This does not apply to comments ({# #}).
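Putting the two points above together, a template that sets stop_strings at the top with whitespace control might look like the following minimal sketch (the message loop here is a hypothetical example, not a template for any specific model):

```jinja
{# Comments do not need whitespace control #}
{%- set stop_strings = ["string1"] -%}
{%- for message in messages -%}
{{ message['role'] }}: {{ message['content'] }}
{% endfor -%}
```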
To use a custom template, place it in the templates folder, and make sure to set the prompt_template field in config.yml (see model config) to the template's filename.
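As a sketch, assuming a custom template saved at templates/chatml.jinja (a hypothetical filename), the relevant part of config.yml might look like this; check the model config documentation for the exact section layout and whether the file extension is included:

```yaml
model:
  # Hypothetical example referencing templates/chatml.jinja
  prompt_template: chatml
```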
Template Variables
A chat completions request to TabbyAPI also supports custom template variables in the form of a key/value object in the JSON body. Here's an example:
"template_vars": {
"test_var": "hello!"
}
Now let's pass the custom var in the following template:
```jinja
I'm going to say {{ test_var }}
```
Rendering this template will now produce: I'm going to say hello!
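For context, a complete chat completions request body carrying template_vars might look like the following sketch (the model name and message content are hypothetical placeholders):

```json
{
  "model": "my-model",
  "messages": [
    { "role": "user", "content": "Say the word." }
  ],
  "template_vars": {
    "test_var": "hello!"
  }
}
```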