jalr/tabbyAPI-ollama

Author	SHA1	Message	Date
kingbri	64c2cc85c9	OAI: Migrate model depends into proper file Use amongst multiple routers. Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-23 13:59:56 -04:00
kingbri	c7ce97f119	Tree: Ruff lint Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-08 15:06:28 -04:00
kingbri	6613e38436	Main: Make openapi export store locally This runs faster than always making a syscall to check if the env var is set. Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-08 14:54:06 -04:00
kingbri	ae66e8f9ba	Ruff: Lint Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-08 13:44:12 -04:00
kingbri	b907421285	Main: Fix launch if EXPORT_OPENAPI is unset A default needs to be provided with getenv. Fix that with an empty string. Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-08 13:41:44 -04:00
kingbri	933268f7e2	API: Integrate OpenAPI export script Move OpenAPI export as an env var within the main function. This allows for easy export by running main. In addition, an env variable provides global and explicit state to disable conditional wheel imports (ex. Exl2 and torch) which caused errors at first. Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-08 12:34:32 -04:00
kingbri	27d2d5f3d2	Config + Model: Allow for default fallbacks from config for model loads Previously, the parameters under the "model" block in config.yml only handled the loading of a model on startup. This meant that any subsequent API request required each parameter to be filled out or use a sane default (usually defaults to the model's config.json). However, there are cases where admins may want an argument from the config to apply if the parameter isn't provided in the request body. To help alleviate this, add a mechanism that works like sampler overrides where users can specify a flag that acts as a fallback. Therefore, this change both preserves the source of truth of what parameters the admin is loading and adds some convenience for users that want customizable defaults for their requests. This behavior may change in the future, but I think it solves the issue for now. Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-06 17:50:58 -04:00
turboderp	0eb8fa5d1e	[fix] Bring draft progress and model progress in sync with model loader (#125 ) * Bring draft progress and model progress in sync with model loader * Fix formatting	2024-06-03 19:41:02 +02:00
kingbri	43cd7f57e8	API + Model: Add blocks and checks for various load requests Add a sequential lock and wait until jobs are completed before executing any loading requests that directly alter the model. However, we also need to block any new requests that come in until the load is finished, so add a condition that triggers once the lock is free. Signed-off-by: kingbri <bdashore3@proton.me>	2024-05-25 21:16:14 -04:00
kingbri	6dfcbbd813	Common: Migrate request utils to networking Helps organize the project better. Utils is meant to be for simple functions like unwrap. Signed-off-by: kingbri <bdashore3@proton.me>	2024-03-21 23:21:57 -04:00
kingbri	95e44c20d6	Model: Fix load if model didn't load properly If the model didn't load properly, the container still exists until unload is called. However, the name check still registered as true. Signed-off-by: kingbri <bdashore3@proton.me>	2024-03-16 23:23:31 -04:00
kingbri	2755fd1af0	API: Fix blocking iterator execution Run these iterators on the background thread. On startup, the API spawns a background thread as needed to run sync code on without blocking the event loop. Use asyncio's run_thread function since it allows for errors to be propegated. Signed-off-by: kingbri <bdashore3@proton.me>	2024-03-16 23:23:31 -04:00
kingbri	7fded4f183	Tree: Switch to async generators Async generation helps remove many roadblocks to managing tasks using threads. It should allow for abortables and modern-day paradigms. NOTE: Exllamav2 itself is not an asynchronous library. It's just been added into tabby's async nature to allow for a fast and concurrent API server. It's still being debated to run stream_ex in a separate thread or manually manage it using asyncio.sleep(0) Signed-off-by: kingbri <bdashore3@proton.me>	2024-03-16 23:23:31 -04:00
kingbri	6f03be9523	API: Split functions into their own files Previously, generation function were bundled with the request function causing the overall code structure and API to look ugly and unreadable. Split these up and cleanup a lot of the methods that were previously overlooked in the API itself. Signed-off-by: kingbri <bdashore3@proton.me>	2024-03-12 23:59:30 -04:00
kingbri	5a2de30066	Tree: Update to cleanup globals Use the module singleton pattern to share global state. This can also be a modified version of the Global Object Pattern. The main reason this pattern is used is for ease of use when handling global state rather than adding extra dependencies for a DI parameter. Signed-off-by: kingbri <bdashore3@proton.me>	2024-03-12 23:59:30 -04:00
kingbri	b373b25235	API: Move to ModelManager This is a shared module which manages the model container and provides extra utility functions around it to help slim down the API. Signed-off-by: kingbri <bdashore3@proton.me>	2024-03-12 23:59:30 -04:00

16 commits