tabbyAPI-ollama

kingbri b0c295dd2f API: Add more methods to semaphore	2024-03-04 23:21:40 -05:00

The semaphore/queue model for Tabby is as follows:
- Any load requests go through the semaphore by default
- Any load request can include the skip_queue parameter to bypass the semaphore
- Any unload requests are immediately executed
- All completion requests are placed inside the semaphore by default

This model preserves the parallelism of single-user mode with extra convenience methods for queues in multi-user. It also helps mitigate problems that were previously present in the concurrency stack.

Also change how the program's loop runs so it exits when the API thread dies.

Signed-off-by: kingbri <bdashore3@proton.me>
chat_completion.py	API: Add logprobs for chat completions	2024-02-08 21:26:53 -05:00
common.py	Model: Add logprobs support	2024-02-08 21:26:53 -05:00
completion.py	Model: Add logprobs support	2024-02-08 21:26:53 -05:00
lora.py	API: Add more methods to semaphore	2024-03-04 23:21:40 -05:00
model.py	API: Add more methods to semaphore	2024-03-04 23:21:40 -05:00
sampler_overrides.py	API: Add sampler override switching	2024-01-25 00:15:40 -05:00
template.py	API: Add template switching and unload endpoints	2024-01-25 00:15:40 -05:00
token.py	Tree: Refactor code organization	2024-01-25 00:15:40 -05:00
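
The semaphore/queue model described in the latest commit message can be sketched roughly as follows. This is a minimal illustration using asyncio, not the code in model.py or lora.py: the class name ModelQueue, its method names, and the limit parameter are hypothetical, and only skip_queue comes from the commit message.

```python
import asyncio
from typing import Awaitable, Callable

Job = Callable[[], Awaitable[str]]


class ModelQueue:
    """Hypothetical sketch of the queue rules; not tabbyAPI's actual code."""

    def __init__(self, limit: int = 1) -> None:
        # Assumption: one shared semaphore gates every queued request.
        self._semaphore = asyncio.Semaphore(limit)
        self.log: list[str] = []

    async def _execute(self, name: str, job: Job) -> str:
        self.log.append(f"start {name}")
        result = await job()
        self.log.append(f"end {name}")
        return result

    async def _run(self, name: str, job: Job, use_queue: bool) -> str:
        if use_queue:
            async with self._semaphore:
                return await self._execute(name, job)
        return await self._execute(name, job)

    async def load(self, name: str, job: Job, skip_queue: bool = False) -> str:
        # Loads are queued by default; skip_queue bypasses the semaphore.
        return await self._run(name, job, use_queue=not skip_queue)

    async def unload(self, name: str, job: Job) -> str:
        # Unloads run immediately and never wait on the semaphore.
        return await self._run(name, job, use_queue=False)

    async def completion(self, name: str, job: Job) -> str:
        # Completions always go through the semaphore.
        return await self._run(name, job, use_queue=True)


async def demo() -> list[str]:
    queue = ModelQueue(limit=1)

    async def work() -> str:
        await asyncio.sleep(0.01)  # stand-in for real model work
        return "done"

    await asyncio.gather(
        queue.completion("completion-1", work),
        queue.completion("completion-2", work),
        queue.unload("unload", work),
        queue.load("load-skip", work, skip_queue=True),
    )
    return queue.log


if __name__ == "__main__":
    for entry in asyncio.run(demo()):
        print(entry)
```

With a limit of 1, the second completion starts only after the first one finishes, while the unload and the skip_queue load start right away without waiting for a free slot.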