tabbyAPI-ollama

kingbri 2096c9bad2	2025-06-13 14:57:24 -04:00

Model: Default max_seq_len to 4096

A common problem in TabbyAPI is that users who want to get up and running with a model always had issues with max_seq_len causing OOMs. This is because model devs set max context values in the millions which requires a lot of VRAM.

To idiot-proof first time setup, make the fallback default 4096 so users can run their models.

If a user still wants to use the model's max_seq_len, set it to -1.

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
actions.py	Downloader: log errors when downloading	2025-02-19 23:16:17 -05:00
args.py	Args: Expose api-servers to subcommands	2025-02-10 23:39:46 -05:00
auth.py	Tree: Format	2024-09-18 20:36:17 -04:00
concurrency.py	API + Model: Add blocks and checks for various load requests	2024-05-25 21:16:14 -04:00
config_models.py	Model: Default max_seq_len to 4096	2025-06-13 14:57:24 -04:00
downloader.py	Downloader: log errors when downloading	2025-02-19 23:16:17 -05:00
gen_logging.py	Model: Add prompt logging to ExllamaV3	2025-05-17 22:05:18 -04:00
hardware.py	Common: Add hardware file	2025-05-02 21:33:25 -04:00
health.py	Add health check monitoring for EXL2 errors (#206)	2024-09-22 21:40:36 -04:00
logger.py	FIx logs path	2025-04-22 21:14:45 -04:00
model.py	Model: Default max_seq_len to 4096	2025-06-13 14:57:24 -04:00
multimodal.py	Tree: Format	2025-02-07 18:03:33 -05:00
networking.py	remove unused imports	2024-09-11 18:00:29 +01:00
optional_dependencies.py	Dependencies: Fix unsupported dependency error	2025-06-13 14:57:02 -04:00
sampling.py	Sampling: Make add_bos_token override concise	2025-05-10 19:07:35 -04:00
signals.py	Signals: Split signal handler between sync and async	2024-09-19 23:31:29 -04:00
tabby_config.py	Cleanup config file loader (#208)	2024-09-23 21:42:01 -04:00
templating.py	Model: Add TokenizerConfig stub and add_eos_token fallback	2025-05-02 00:08:01 -04:00
transformers_utils.py	Model: Default max_seq_len to 4096	2025-06-13 14:57:24 -04:00
utils.py	Tree: Format + Lint	2025-04-26 02:14:30 -04:00
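
The latest commit message above describes a fallback rule for max_seq_len: 4096 when nothing is configured, and the model's own maximum when the value is set to -1. The following is a minimal sketch of that rule, not the code from this repository. The function name resolve_max_seq_len, the constant FALLBACK_MAX_SEQ_LEN, and the rejection of other non-positive values are assumptions made for illustration.

```python
from typing import Optional

# Fallback default named in the commit message.
FALLBACK_MAX_SEQ_LEN = 4096


def resolve_max_seq_len(configured: Optional[int], model_max: int) -> int:
    """Hypothetical helper illustrating the rule from the commit message.

    configured: the user's max_seq_len setting, or None if unset.
    model_max:  the maximum context length declared by the model.
    """
    if configured is None:
        # Nothing configured: use the safe fallback instead of the
        # model's (possibly very large) declared maximum.
        return FALLBACK_MAX_SEQ_LEN
    if configured == -1:
        # Explicit opt-in to the model's own maximum.
        return model_max
    if configured <= 0:
        # Assumption: any other non-positive value is treated as invalid.
        raise ValueError(f"invalid max_seq_len: {configured}")
    return configured


if __name__ == "__main__":
    print(resolve_max_seq_len(None, 1_000_000))
    print(resolve_max_seq_len(-1, 1_000_000))
    print(resolve_max_seq_len(8192, 1_000_000))
```

Under these assumptions, a model that declares a context length in the millions only allocates for that length when the user explicitly asks for it with -1.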