jalr/tabbyAPI-ollama

Author	SHA1	Message	Date
kingbri	f070587e9f	Model: Add proper jobs cleanup and fix var calls Jobs should be started and immediately cleaned up when calling the generation stream. Expose a stream_generate function and append this to the base class since it's more idiomatic than generate_gen. The exl2 container's generate_gen function is now internal. Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>	2025-04-24 21:30:55 -04:00
kingbri	f15ac1f69d	Model: Reject model requests when unloading If a model is being unloaded, that means it's being shut down and no requests should be accepted from then on. Also, remove model_is_loaded since we simply check if the container is None now. Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>	2025-04-19 22:34:06 -04:00
kingbri	aa4ccd03d4	Infinity: Use a runtime type hint for engine Remove the antipattern of the conditional type for the Async engine and use string-based type inference. Signed-off-by: kingbri <bdashore3@proton.me>	2024-11-22 18:06:08 -05:00
TerminalMan	3aeddc5255	fix issues with optional dependencies (#204) * fix issues with optional dependencies * format document * Tree: Format and comment	2024-09-19 22:24:55 -04:00
Jake	42a42caf43	remove logging - remove logging statements - format code with ruff	2024-09-04 16:14:09 +01:00
TerminalMan	43104e0d19	Complete conditional infinity import TODO - add logging - change declaration order	2024-08-31 21:48:43 +01:00
kingbri	dc3dcc9c0d	Embeddings: Update config, args, and parameter names Use embeddings_device as the parameter for device to remove ambiguity. Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-30 15:32:26 -04:00
kingbri	f13d0fb8b3	Embeddings: Add model load checks Same as the normal model container. Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-30 11:17:36 -04:00
kingbri	01c7702859	Signal: Fix async signal handling Run unload async functions before exiting the program. Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-30 11:11:05 -04:00
kingbri	fbf1455db1	Embeddings: Migrate and organize Infinity Use Infinity as a separate backend and handle the model within the common module. This separates out the embeddings model from the endpoint which allows for model loading/unloading in core. Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-30 11:00:23 -04:00

10 commits