Config + Docs: Clarify YaRN rope scaling changes
In ExLlamaV2, if a model has native YaRN support, the linear RoPE scaling options are not applied. Users can simply set max_seq_len, and exl2 will take care of the rest.

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
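The behavior described in the commit message can be sketched as follows. This is a minimal illustration, not the actual ExLlamaV2 code: the function name and dict layout are assumptions, though the config keys mirror the YAML below.

```python
# Sketch of the described behavior: when a model ships with YaRN rope
# scaling, user-supplied rope_scale / rope_alpha are ignored and only
# max_seq_len is honored. Names are illustrative, not the real API.

def resolve_rope_settings(model_config: dict, user_config: dict) -> dict:
    """Decide which RoPE settings actually take effect."""
    if model_config.get("rope_scaling", {}).get("type") == "yarn":
        # YaRN is enabled automatically; linear scaling knobs don't apply.
        return {"max_seq_len": user_config.get("max_seq_len")}
    return {
        "max_seq_len": user_config.get("max_seq_len"),
        "rope_scale": user_config.get("rope_scale", 1.0),
        "rope_alpha": user_config.get("rope_alpha", 1.0),
    }

# A YaRN model: the user's rope_scale is dropped.
yarn = resolve_rope_settings(
    {"rope_scaling": {"type": "yarn"}},
    {"max_seq_len": 32768, "rope_scale": 2.0},
)
print(yarn)  # {'max_seq_len': 32768}
```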
parent a20abe2d33
commit 698d8339cb
2 changed files with 5 additions and 2 deletions
@@ -95,6 +95,9 @@ model:
   # Used with tensor parallelism.
   gpu_split: []
 
+  # NOTE: If a model has YaRN rope scaling, it will automatically be enabled by ExLlama.
+  # rope_scale and rope_alpha settings won't apply in this case.
+
   # Rope scale (default: 1.0).
   # Same as compress_pos_emb.
   # Use if the model was trained on long context with rope.
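Following the note added above, a user config for a YaRN-scaled model can stay minimal. This is a sketch; the max_seq_len value is illustrative, not a recommended setting:

```yaml
model:
  # Tensor-parallel split left at its default.
  gpu_split: []

  # For a YaRN model this is all that's needed: ExLlama enables YaRN
  # automatically, and rope_scale / rope_alpha would be ignored anyway.
  max_seq_len: 32768
```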