tabbyAPI-ollama

kingbri beb6d8faa5 Model: Adjust draft_gpu_split and add to config	2025-02-08 16:09:46 -05:00

The previous code overrode the existing gpu split and device idx values. This now sets an independent draft_gpu_split value and adjusts the gpu_devices check only if the draft_gpu_split array is larger than the gpu_split array. Draft gpu split is not Tensor Parallel, and defaults to gpu_split_auto if a split is not provided.

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>

actions.py	fix issues with optional dependencies (#204)	2024-09-19 22:24:55 -04:00
args.py	Start: Fix startup with new argparser	2024-09-21 14:36:21 -04:00
auth.py	Tree: Format	2024-09-18 20:36:17 -04:00
concurrency.py	API + Model: Add blocks and checks for various load requests	2024-05-25 21:16:14 -04:00
config_models.py	Model: Adjust draft_gpu_split and add to config	2025-02-08 16:09:46 -05:00
downloader.py	config is now backed by pydantic (WIP)	2024-09-05 18:04:56 +01:00
gen_logging.py	Logging: Remove preferences global	2024-09-14 21:49:44 -04:00
health.py	Add health check monitoring for EXL2 errors (#206)	2024-09-22 21:40:36 -04:00
logger.py	Logger: Add timestamps	2025-02-07 18:40:28 -05:00
model.py	Model: Fix load packets	2024-11-21 18:06:47 -05:00
multimodal.py	Tree: Format	2025-02-07 18:03:33 -05:00
networking.py	remove unused imports	2024-09-11 18:00:29 +01:00
optional_dependencies.py	Dependencies: Remove outlines from optional check	2024-12-18 11:56:40 -05:00
sampling.py	Sampling: Add max_completion_tokens	2024-12-13 01:02:37 -05:00
signals.py	Signals: Split signal handler between sync and async	2024-09-19 23:31:29 -04:00
tabby_config.py	Cleanup config file loader (#208)	2024-09-23 21:42:01 -04:00
templating.py	Tree: Fix classmethod usage	2024-09-10 20:52:29 -04:00
transformers_utils.py	Refactor the sampling class (#199)	2024-10-27 11:43:41 -04:00
utils.py	Cleanup config file loader (#208)	2024-09-23 21:42:01 -04:00