tabbyAPI-ollama/backends
kingbri bdc5189a4b Exl3: Add chunk size, cache size, and model info
Use the same algorithm to estimate and adjust the cache size, so that
it is a multiple of 256 and no smaller than max seq len.

The same applies to chunk size.

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-05-02 21:33:25 -04:00
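
The adjustment described in the commit message can be sketched roughly as follows. This is an illustration only, not the code from the commit: the function name `adjust_to_multiple` and its parameters are hypothetical, and rounding up (rather than down) to the next multiple of 256 is an assumption.

```python
def adjust_to_multiple(requested: int, max_seq_len: int, multiple: int = 256) -> int:
    """Hypothetical helper, not taken from the repository.

    Returns a size that is at least max_seq_len and a multiple of `multiple`.
    Rounding up to the next multiple is an assumption.
    """
    # Never go below the maximum sequence length.
    size = max(requested, max_seq_len)

    # Round up to the next multiple if the size is not already aligned.
    remainder = size % multiple
    if remainder:
        size += multiple - remainder
    return size


if __name__ == "__main__":
    # A request below max_seq_len is raised to max_seq_len (already aligned here).
    print(adjust_to_multiple(1000, 4096))
    # A request above max_seq_len is rounded up to the next multiple of 256.
    print(adjust_to_multiple(5000, 4096))
```

Under these assumptions the same helper could serve both the cache size and the chunk size, which is one reading of "same applies for chunk size"; the backends may clamp chunk size differently.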
Name                      Last commit                                           Date
exllamav2                 Exl3: Add chunk size, cache size, and model info      2025-05-02 21:33:25 -04:00
exllamav3                 Exl3: Add chunk size, cache size, and model info      2025-05-02 21:33:25 -04:00
infinity                  Model: Add proper jobs cleanup and fix var calls      2025-04-24 21:30:55 -04:00
base_model_container.py   Model: Add exl3 and associated load functions         2025-05-02 21:32:39 -04:00