Model: Raise an error if the context length is too large

The dynamic generator already raised an unhelpful exception that
essentially said not to exceed the max sequence length. Instead of
risking undefined behavior, error out with a clear message.

Signed-off-by: kingbri <bdashore3@proton.me>
kingbri 2024-09-19 22:05:56 -04:00
parent b30336c75b
commit 75af974c88

@@ -1228,10 +1228,9 @@ class ExllamaV2Container:
         # The first index will always be the positive prompt
         context_len = input_ids[0].size(dim=-1)
         if context_len > self.config.max_seq_len:
-            logger.warning(
+            raise ValueError(
                 f"Context length {context_len} is greater than max_seq_len "
-                f"{self.config.max_seq_len}. Generation is truncated and "
-                "metrics may not be accurate."
+                f"{self.config.max_seq_len}"
             )

         # Automatically set max_tokens to fill up the context
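
The change boils down to a fail-fast length check. A minimal sketch of that check in isolation (the `check_context_length` helper and its parameters are hypothetical; the real code reads `context_len` from `input_ids` and `max_seq_len` from the container config):

```python
def check_context_length(context_len: int, max_seq_len: int) -> None:
    # Fail fast instead of silently truncating the prompt, which would
    # produce truncated generations and inaccurate metrics.
    if context_len > max_seq_len:
        raise ValueError(
            f"Context length {context_len} is greater than max_seq_len "
            f"{max_seq_len}"
        )


# A prompt within the limit passes silently; an oversized one raises.
check_context_length(4000, 4096)
try:
    check_context_length(5000, 4096)
except ValueError as e:
    print(e)
```

Raising `ValueError` here pushes the problem to the caller immediately, rather than letting generation proceed in a truncated, hard-to-debug state.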