tabbyAPI-ollama/endpoints
kingbri 9f649647f0 Model + API: GPU split updates and fixes
For the TP loader, GPU split cannot be an empty array. However,
defaulting the parameter to an empty array makes it easier to calculate
the device list. Therefore, cast an empty array to None using
falsy comparisons at load time.

Also add draft_gpu_split to the load request.

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-02-15 21:50:14 -05:00
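
The falsy cast described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `normalize_gpu_split` and `device_indices` are hypothetical, and the device-list logic is only an assumption about why an empty-list default is convenient.

```python
from typing import List, Optional


def normalize_gpu_split(gpu_split: Optional[List[float]]) -> Optional[List[float]]:
    """Hypothetical helper: collapse an empty or missing split to None."""
    # An empty list is falsy in Python, so `or None` turns [] into None
    # while leaving any non-empty list untouched.
    return gpu_split or None


def device_indices(gpu_split: List[float]) -> List[int]:
    """Hypothetical device-list calculation over a list-typed split."""
    # With an empty-list default, this comprehension needs no None check:
    # it simply yields no devices when no split was provided.
    return [i for i, size in enumerate(gpu_split) if size > 0]


if __name__ == "__main__":
    requested: List[float] = []  # assumed default value of the request field
    print(device_indices(requested))       # device list from the raw default
    print(normalize_gpu_split(requested))  # value handed to the loader
```

Keeping the request default as a list and converting only at load time means the device calculation and the loader can each receive the form they expect.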
core       Model + API: GPU split updates and fixes          2025-02-15 21:50:14 -05:00
Kobold     Dependencies: Update sse-starlette and formatron  2024-12-21 23:14:55 -05:00
OAI        Embeddings: Fix base64 return                     2025-01-01 16:15:12 -05:00
server.py  Args: Expose api-servers to subcommands           2025-02-10 23:39:46 -05:00