Config + Docs: Clarify YaRN rope scaling changes

In ExLlamaV2, if a model has YaRN rope scaling, the linear RoPE options
(rope_scale and rope_alpha) are not applied. Users can set max_seq_len
and exl2 will take care of the rest.

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
kingbri 2025-03-19 11:47:49 -04:00
parent a20abe2d33
commit 698d8339cb
2 changed files with 5 additions and 2 deletions
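
For reference, a minimal sketch of how a user might configure a YaRN model
under this change, assuming the config.yml layout shown in the diff below
(the 32768 context length is only an illustrative value):

model:
  # For a YaRN model, setting the target context length is enough;
  # ExLlamaV2 enables and configures YaRN scaling from it automatically.
  max_seq_len: 32768

  # Linear rope settings are ignored when the model uses YaRN scaling,
  # so they can stay unset / at their defaults.
  # rope_scale: 1.0
  # rope_alpha: 1.0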


@@ -95,6 +95,9 @@ model:
   # Used with tensor parallelism.
   gpu_split: []
+  # NOTE: If a model has YaRN rope scaling, it will automatically be enabled by ExLlama.
+  # rope_scale and rope_alpha settings won't apply in this case.
   # Rope scale (default: 1.0).
   # Same as compress_pos_emb.
   # Use if the model was trained on long context with rope.