Docs: Edit inline loading for breaking changes

Add the model key for the YAML examples.

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-07-24 18:11:42 -04:00
commit e77fa0b7a8
parent ab04a6ed60
1 changed file with 6 additions and 7 deletions
--- a/docs/03.-Usage.md
+++ b/docs/03.-Usage.md
@@ -94,21 +94,20 @@ To get started, set `inline_model_loading` to `true` under the model block of co

 Now to create a tabby config, let's say we have a model in our models directory called `Meta-Llama-3-8B-exl2`. Navigate into that model folder and create a file called `tabby_config.yml`

-> [!NOTE]
-> The formatting for tabby_config.yml may change in the future for consistency with config.yml. Please keep an eye out for breaking changes.
-
 Now, you can place any model load parameter from `/v1/model/load` into that file. Here's a simple example which changes the default `max_seq_len` to 8192 and sets a Q6 quantized cache:

 ```yml
-max_seq_len: 8192
-cache_mode: Q6
+model:
+  max_seq_len: 8192
+  cache_mode: Q6
 ```
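Review note: because this is a breaking change, existing `tabby_config.yml` files in the old flat layout need their load parameters moved under `model`. A minimal sketch of that migration on already-parsed data is below. It assumes every top-level key other than `draft_model` is a model load parameter, which this diff suggests but does not confirm. `migrate_inline_config` is a hypothetical helper, not part of the project.

```python
def migrate_inline_config(old: dict) -> dict:
    """Nest flat load parameters under `model`; keep `draft_model` at top level.

    Assumption: any top-level key that is not `draft_model` is a model
    load parameter (true for the examples in this diff).
    """
    # A config that already has a `model` key is treated as migrated.
    if "model" in old:
        return dict(old)

    new = {"model": {}}
    for key, value in old.items():
        if key == "draft_model":
            new["draft_model"] = value
        else:
            new["model"][key] = value
    return new


# Old flat layout, as in the removed lines of this diff.
old_config = {
    "max_seq_len": 8192,
    "cache_mode": "Q6",
    "draft_model": {"draft_model_name": "TinyLlama-1B-32k-exl2"},
}
new_config = migrate_inline_config(old_config)
print(new_config)
```

The helper works on plain dictionaries, so it can sit between whichever YAML loader and dumper you already use. Running it on an already migrated config returns the config unchanged.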

 If you'd like to provide draft model options, you can add them under the `draft_model` key:

 ```yml
-max_seq_len: 8192
-cache_mode: Q6
+model:
+  max_seq_len: 8192
+  cache_mode: Q6
 draft_model:
 	draft_model_name: TinyLlama-1B-32k-exl2
 	draft_rope_scale: 1.0