tabbyAPI-ollama

kingbri 2096c9bad2	Model: Default max_seq_len to 4096	2025-06-13 14:57:24 -04:00

A common problem in TabbyAPI is that users who want to get up and running with a model always had issues with max_seq_len causing OOMs. This is because model devs set max context values in the millions which requires a lot of VRAM.

To idiot-proof first time setup, make the fallback default 4096 so users can run their models. If a user still wants to use the model's max_seq_len, set it to -1.

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>

exllamav2	Model: Default max_seq_len to 4096	2025-06-13 14:57:24 -04:00
exllamav3	Dependencies: Bump ExllamaV3 and ExllamaV2	2025-05-31 23:55:04 +02:00
infinity	Model: Add proper jobs cleanup and fix var calls	2025-04-24 21:30:55 -04:00
base_model_container.py	Model: Create universal HFModel class	2025-05-13 18:12:38 -04:00