Config + Docs: Clarify YaRN rope scaling changes

In ExLlamaV2, if a model has YaRN rope scaling, the linear RoPE options
(rope_scale and rope_alpha) are not applied. Users can set max_seq_len
and exl2 will take care of the rest.

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
kingbri 2025-03-19 11:47:49 -04:00
parent a20abe2d33
commit 698d8339cb
2 changed files with 5 additions and 2 deletions
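
For reference, a minimal sketch of how a user might configure a YaRN model
under this change, assuming the config.yml layout shown in the diff below
(the 32768 context length is only an illustrative value):

model:
  # For a YaRN model, setting the target context length is enough;
  # ExLlamaV2 enables and configures YaRN scaling from it automatically.
  max_seq_len: 32768

  # Linear rope settings are ignored when the model uses YaRN scaling,
  # so they can stay unset / at their defaults.
  # rope_scale: 1.0
  # rope_alpha: 1.0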


@@ -95,6 +95,9 @@ model:
   # Used with tensor parallelism.
   gpu_split: []
+  # NOTE: If a model has YaRN rope scaling, it will automatically be enabled by ExLlama.
+  # rope_scale and rope_alpha settings won't apply in this case.
   # Rope scale (default: 1.0).
   # Same as compress_pos_emb.
   # Use if the model was trained on long context with rope.