tabbyAPI-ollama/common
kingbri beb6d8faa5 Model: Adjust draft_gpu_split and add to config
The previous code overrode the existing gpu split and device idx
values. This now sets an independent draft_gpu_split value and
adjusts the gpu_devices check only if the draft_gpu_split array
is larger than the gpu_split array.

Draft gpu split is not Tensor Parallel, and defaults to gpu_split_auto
if a split is not provided.

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-02-08 16:09:46 -05:00
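
The split handling described in the commit message above could be sketched roughly as follows. This is an illustrative sketch only, not the code from this commit: the function name `resolve_draft_split`, its return shape, and the rule that a zero entry marks an unused GPU are all assumptions made for the example.

```python
from typing import List, Optional, Tuple


def resolve_draft_split(
    gpu_split: List[float],
    draft_gpu_split: Optional[List[float]],
) -> Tuple[bool, List[int]]:
    """Hypothetical helper: return (draft_uses_auto_split, gpu_device_indices)."""
    # Keep the draft split independent of the main model's split.
    draft_split = draft_gpu_split or []

    # If no draft split is provided, fall back to automatic splitting.
    draft_auto = len(draft_split) == 0

    # Only let the draft split drive the device check when it spans
    # more GPUs than the main split does.
    widest = draft_split if len(draft_split) > len(gpu_split) else gpu_split

    # Assumption: an entry of 0 means that GPU is not used.
    devices = [idx for idx, amount in enumerate(widest) if amount > 0]
    return draft_auto, devices


if __name__ == "__main__":
    print(resolve_draft_split([20, 20], None))
    print(resolve_draft_split([20, 20], [8, 8, 8]))
```

In this sketch the main `gpu_split` is never modified; the draft split only widens the device list when it is the longer of the two arrays.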
actions.py fix issues with optional dependencies (#204) 2024-09-19 22:24:55 -04:00
args.py Start: Fix startup with new argparser 2024-09-21 14:36:21 -04:00
auth.py Tree: Format 2024-09-18 20:36:17 -04:00
concurrency.py API + Model: Add blocks and checks for various load requests 2024-05-25 21:16:14 -04:00
config_models.py Model: Adjust draft_gpu_split and add to config 2025-02-08 16:09:46 -05:00
downloader.py config is now backed by pydantic (WIP) 2024-09-05 18:04:56 +01:00
gen_logging.py Logging: Remove preferences global 2024-09-14 21:49:44 -04:00
health.py Add health check monitoring for EXL2 errors (#206) 2024-09-22 21:40:36 -04:00
logger.py Logger: Add timestamps 2025-02-07 18:40:28 -05:00
model.py Model: Fix load packets 2024-11-21 18:06:47 -05:00
multimodal.py Tree: Format 2025-02-07 18:03:33 -05:00
networking.py remove unused imports 2024-09-11 18:00:29 +01:00
optional_dependencies.py Dependencies: Remove outlines from optional check 2024-12-18 11:56:40 -05:00
sampling.py Sampling: Add max_completion_tokens 2024-12-13 01:02:37 -05:00
signals.py Signals: Split signal handler between sync and async 2024-09-19 23:31:29 -04:00
tabby_config.py Cleanup config file loader (#208) 2024-09-23 21:42:01 -04:00
templating.py Tree: Fix classmethod usage 2024-09-10 20:52:29 -04:00
transformers_utils.py Refactor the sampling class (#199) 2024-10-27 11:43:41 -04:00
utils.py Cleanup config file loader (#208) 2024-09-23 21:42:01 -04:00