tabbyAPI-ollama/backends/exllamav3
kingbri bdc5189a4b Exl3: Add chunk size, cache size, and model info
Use the same algorithm for estimating and adjusting the cache size, so
that it is a multiple of 256 and at least max seq len.

The same applies to chunk size.

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-05-02 21:33:25 -04:00
model.py Exl3: Add chunk size, cache size, and model info 2025-05-02 21:33:25 -04:00
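
The commit message above describes the cache size adjustment only in words. The sketch below is one way such an adjustment could look. It is not taken from model.py: the names PAGE_SIZE, round_up_to_multiple, and adjust_cache_size are hypothetical, and rounding up (rather than down) is an assumption. How the same rule is applied to chunk size is not spelled out in the message, so it is left out here.

```python
from typing import Optional

# Assumption: 256 is the multiple named in the commit message.
PAGE_SIZE = 256


def round_up_to_multiple(value: int, multiple: int = PAGE_SIZE) -> int:
    """Round value up to the nearest multiple (hypothetical helper)."""
    # Ceiling division on integers, then scale back up.
    return -(-value // multiple) * multiple


def adjust_cache_size(requested: Optional[int], max_seq_len: int) -> int:
    """Return a cache size that is a multiple of PAGE_SIZE and >= max_seq_len.

    Hypothetical sketch: falls back to max_seq_len when no size is requested.
    """
    size = max_seq_len if requested is None else max(requested, max_seq_len)
    return round_up_to_multiple(size)


if __name__ == "__main__":
    # A request below max_seq_len is raised to max_seq_len.
    print(adjust_cache_size(1000, 4096))
    # A request above max_seq_len is rounded up to the next multiple of 256.
    print(adjust_cache_size(5000, 4096))
```

Rounding after taking the maximum keeps both conditions true at once: the result is never below max seq len, and it is always a multiple of 256.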