diff --git a/docs/03.-Usage.md b/docs/03.-Usage.md
index 609b5cb..e8748b4 100644
--- a/docs/03.-Usage.md
+++ b/docs/03.-Usage.md
@@ -94,21 +94,20 @@ To get started, set `inline_model_loading` to `true` under the model block of co
 
 Now to create a tabby config, let's say we have a model in our models directory called `Meta-Llama-3-8B-exl2`. Navigate into that model folder and create a file called `tabby_config.yml`
 
-> [!NOTE]
-> The formatting for tabby_config.yml may change in the future for consistency with config.yml. Please keep an eye out for breaking changes.
-
 Now, you can place any model load parameter from `/v1/model/load` into that file. Here's a simple example which changes the default `max_seq_len` to 8192 and sets a Q6 quantized cache:
 
 ```yml
-max_seq_len: 8192
-cache_mode: Q6
+model:
+  max_seq_len: 8192
+  cache_mode: Q6
 ```
 
 If you'd like to provide draft model options, you can add them under the `draft_model` key:
 
 ```yml
-max_seq_len: 8192
-cache_mode: Q6
+model:
+  max_seq_len: 8192
+  cache_mode: Q6
 draft_model:
   draft_model_name: TinyLlama-1B-32k-exl2
   draft_rope_scale: 1.0