jalr/tabbyAPI-ollama

Author	SHA1	Message	Date
kingbri	14dfaf600a	Args: Add request logging Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-22 21:41:42 -04:00
kingbri	3826815edb	API: Add request logging Log all the parts of a request if the config flag is set. The logged fields are all server side anyways, so nothing is being exposed to clients. Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-22 21:40:00 -04:00
kingbri	522999ebb4	Config: Change from gen_logging to logging More accurately reflects the config.yml's sections. Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-22 21:15:16 -04:00
kingbri	15f891b277	Args: Update to latest config.yml Fix order of params to follow the same flow as config.yml Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-22 16:26:41 -04:00
kingbri	0eedc8ca14	API: Switch from request ID middleware to depends Middleware runs on both the request and response. Therefore, streaming responses had increased latency when processing tasks and sending data to the client which resulted in erratic streaming behavior. Use a depends to add request IDs since it only executes when the request is run rather than expecting the response to be sent as well. For the future, it would be best to think about limiting the time between each tick of chunk data to be safe. Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-22 12:19:46 -04:00
kingbri	cae94b920c	API: Add ability to use request IDs Identify which request is being processed to help users disambiguate which logs correspond to which request. Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-21 21:01:05 -04:00
kingbri	38185a1ff4	Auth: Fix key check coalesce Prefer the auth-specific headers before the generic authorization header. Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-19 10:08:57 -04:00
kingbri	e20a2d504b	API: Fix pydantic validation errors on disconnect poll returns Raise a 422 exception for the disconnect. This prevents pydantic errors when returning a "response" which doesn't contain anything in this case. Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-15 14:41:49 -04:00
kingbri	6019c93637	Networking: Gate sending tracebacks over the API It's possible that tracebacks can give too much info about a system when sent over the API. Gate this under a flag to send them only when debugging since this feature is still useful. Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-14 10:30:11 -04:00
kingbri	1f46a1130c	OAI: Restrict list permissions for API keys API keys are not allowed to view all the admin's models, templates, draft models, loras, etc. Basically anything that can be viewed on the filesystem outside of anything that's currently loaded is not allowed to be returned unless an admin key is present. This change helps preserve user privacy while not erroring out on list endpoints that the OAI spec requires. Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-11 14:22:50 -04:00
kingbri	10890913b8	Auth: Revert x-admin-key allowance in API key check These kinda clash with each other. Use the correct header for the correct endpoint. Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-11 14:22:50 -04:00
kingbri	b9a58ff01b	Auth: Make key permission check work on Requests Pass a request and internally unwrap the headers. In addition, allow X-admin-key to get checked in an API key request. Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-11 14:22:49 -04:00
kingbri	c7ce97f119	Tree: Ruff lint Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-08 15:06:28 -04:00
kingbri	6613e38436	Main: Make openapi export store locally This runs faster than always making a syscall to check if the env var is set. Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-08 14:54:06 -04:00
kingbri	ae66e8f9ba	Ruff: Lint Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-08 13:44:12 -04:00
kingbri	b907421285	Main: Fix launch if EXPORT_OPENAPI is unset A default needs to be provided with getenv. Fix that with an empty string. Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-08 13:41:44 -04:00
kingbri	933268f7e2	API: Integrate OpenAPI export script Move OpenAPI export as an env var within the main function. This allows for easy export by running main. In addition, an env variable provides global and explicit state to disable conditional wheel imports (ex. Exl2 and torch) which caused errors at first. Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-08 12:34:32 -04:00
kingbri	27d2d5f3d2	Config + Model: Allow for default fallbacks from config for model loads Previously, the parameters under the "model" block in config.yml only handled the loading of a model on startup. This meant that any subsequent API request required each parameter to be filled out or use a sane default (usually defaults to the model's config.json). However, there are cases where admins may want an argument from the config to apply if the parameter isn't provided in the request body. To help alleviate this, add a mechanism that works like sampler overrides where users can specify a flag that acts as a fallback. Therefore, this change both preserves the source of truth of what parameters the admin is loading and adds some convenience for users that want customizable defaults for their requests. This behavior may change in the future, but I think it solves the issue for now. Signed-off-by: kingbri <bdashore3@proton.me>	2024-07-06 17:50:58 -04:00
turboderp	0eb8fa5d1e	[fix] Bring draft progress and model progress in sync with model loader (#125) * Bring draft progress and model progress in sync with model loader * Fix formatting	2024-06-03 19:41:02 +02:00
DocShotgun	7084081b1f	Tree: Lint	2024-05-26 18:27:30 -07:00
DocShotgun	ce5e2ec8de	Logging: Clarify new vs cached tokens in prompt processing	2024-05-26 18:21:17 -07:00
DocShotgun	7ab7ffd562	Tree: Format	2024-05-26 15:48:18 -07:00
DocShotgun	767e6a798a	API + Model: Add support for specifying k/v cache size	2024-05-26 14:17:01 -07:00
kingbri	9fbbc5afca	Tree: Swap from map to list comprehensions List comprehensions are the more "pythonic" way to approach mapping values to a list. They're also more flexible across different collection types rather than the inbuilt map method. It's best to keep one convention rather than splitting down two. Signed-off-by: kingbri <bdashore3@proton.me>	2024-05-25 21:16:14 -04:00
kingbri	43cd7f57e8	API + Model: Add blocks and checks for various load requests Add a sequential lock and wait until jobs are completed before executing any loading requests that directly alter the model. However, we also need to block any new requests that come in until the load is finished, so add a condition that triggers once the lock is free. Signed-off-by: kingbri <bdashore3@proton.me>	2024-05-25 21:16:14 -04:00
kingbri	06ff47e2b4	Model: Use true async jobs and add logprobs The new async dynamic job allows for native async support without the need of threading. Also add logprobs and metrics back to responses. Signed-off-by: kingbri <bdashore3@proton.me>	2024-05-25 21:16:14 -04:00
kingbri	c474076b22	Concurrency: Remove release_semaphore method At any point for any request cancellation, the semaphore will be decremented. This is an issue since an arbitrary request can desync the semaphore, causing multiple tasks to be processed at once and break generation. Remove this from the networking handlers and therefore, remove the release_semaphore function itself. Signed-off-by: kingbri <bdashore3@proton.me>	2024-05-19 10:42:26 -04:00
kingbri	b9fd8555fe	Sampling: Copy over iterable overrides If an override was iterable, any modifications to the returned value would alter the reference to the global storage dict. Therefore, copy the structure if it's an iterable so any modification won't alter the original override. Also apply this for the function that checks for forced overrides. Signed-off-by: kingbri <bdashore3@proton.me>	2024-05-17 21:38:28 -04:00
DocShotgun	abe411c6fb	API + Model: Add support for regex pattern constraints Adds the ability to constrain generation via regex pattern using lm-format-enforcer.	2024-05-12 19:10:43 -07:00
DocShotgun	9463ecfa40	Samplers: Minor fixes for sampler override * Add missing settings to sample_preset.yml * Fix override for skip_special_tokens	2024-05-12 00:31:31 -07:00
kingbri	c8ec742be9	Samplers: Expose skew sampling Skew is an extra unused sampler in ExllamaV2. Add it in for coverage. Signed-off-by: kingbri <bdashore3@proton.me>	2024-05-12 01:41:01 -04:00
kingbri	6f4012d20d	API: Add preset listing for sampler overrides Querying the overrides list endpoint now returns the selected preset and a list of presets to use. Signed-off-by: kingbri <bdashore3@proton.me>	2024-05-12 01:34:51 -04:00
DocShotgun	c0b631ba92	API: Add banned_strings From exllamav2: List of strings that the generator will refuse to output. As soon as a partial match happens, a checkpoint is saved that the generator can rewind to if need be. Subsequent tokens are then held until the full string is resolved (match or no match) and either emitted or discarded, accordingly.	2024-05-10 13:53:55 -07:00
DocShotgun	a1df22668b	API: Add min_tokens Bans the EOS token until the generation reaches a minimum length. This will not prevent the model from otherwise ending the generation early by outputting other stop conditions.	2024-05-10 12:30:17 -07:00
kingbri	ab526f7278	Revert "API: Remove unncessary Optional signatures" This reverts commit `7556dcf134`. The Optionals allowed requests to send "null" in the body for optional parameters which should be allowed. Signed-off-by: kingbri <bdashore3@proton.me>	2024-05-02 21:23:48 -04:00
kingbri	7556dcf134	API: Remove unncessary Optional signatures Optional isn't necessary if the function signature has a default value. Signed-off-by: kingbri <bdashore3@proton.me>	2024-05-01 00:04:52 -04:00
kingbri	ae75db1829	Downloader: Cleanup on exception Otherwise a file exists error will show up if any exception happens but cancel. Signed-off-by: kingbri <bdashore3@proton.me>	2024-04-30 23:26:22 -04:00
kingbri	e4084b15c1	Downloader: Format Make a public function private. Signed-off-by: kingbri <bdashore3@proton.me>	2024-04-30 01:16:57 -04:00
kingbri	50e0b71690	Downloader: Fix handling of include pattern If an include or exclude pattern is provided, include should include all files by default. Signed-off-by: kingbri <bdashore3@proton.me>	2024-04-30 01:13:06 -04:00
kingbri	21a01741c9	Downloader: Add include and exclude parameters These both take an array of glob strings to state what files or directories to include or exclude when parsing the download list. Signed-off-by: kingbri <bdashore3@proton.me>	2024-04-30 00:58:54 -04:00
kingbri	c47869c606	Downloader: Fix fallback mechanisms Use None-ish coalescing instead of unwrap optional handling. This means that any value that is "empty" for python will default to the fallback. Ex. print("" or "test") will print out "test" Signed-off-by: kingbri <bdashore3@proton.me>	2024-04-29 23:33:37 -04:00
kingbri	55ccd1baad	API: Add HuggingFace downloader Adds an asynchronous huggingface downloader that uses HF hub to fetch all repo files. The current HF hub package has a snapshot_download function that does not cancel on KeyboardInterrupt. Instead, make a downloader that uses the Rich progress bar styling along with a cancellable interface. Finally, link this to TabbyAPI. Signed-off-by: kingbri <bdashore3@proton.me>	2024-04-29 01:15:02 -04:00
kingbri	6114bfd221	API: Fix banned_tokens string when empty The string should not be parsed and any non-string elements should be removed as well. Signed-off-by: kingbri <bdashore3@proton.me>	2024-04-28 12:46:28 -04:00
kingbri	6f9da97114	API: Add banned_tokens Appends the banned tokens to the generation. This is equivalent of setting logit bias to -100 on a specific set of tokens. Signed-off-by: kingbri <bdashore3@proton.me>	2024-04-28 11:06:09 -04:00
kingbri	ed7cd3cb59	Network: Fix socket check timeout Make this a one second timeout to check if a socket is connected. Signed-off-by: kingbri <bdashore3@proton.me>	2024-04-22 21:33:41 -04:00
kingbri	cab789e685	Templates: Migrate to class Having many utility functions for initialization doesn't make much sense. Instead, handle anything regarding template creation inside the class which reduces the amount of function imports. Signed-off-by: kingbri <bdashore3@proton.me>	2024-04-21 23:28:14 -04:00
kingbri	9f93505bc1	OAI: Add skip_special_tokens parameter Allows the ability to decode special tokens if the user wishes. Signed-off-by: kingbri <bdashore3@proton.me>	2024-04-21 00:37:46 -04:00
kingbri	67f061859d	Tree: Add transformers_utils Part of commit `8824ea0205` Signed-off-by: kingbri <bdashore3@proton.me>	2024-04-20 00:07:39 -04:00
kingbri	46ac3beea9	Templates: Support list style chat_template keys HuggingFace updated transformers to provide templates in a list for tokenizers. Update to support this new format. Providing the name of a template for the "prompt_template" value in config.yml will also look inside the template list. In addition, log if there's a template exception, but continue model loading since it shouldn't shut down the application. Signed-off-by: kingbri <bdashore3@proton.me>	2024-04-07 11:20:25 -04:00
Brian Dashore	cdb96e4f74	Merge pull request #93 from AlpinDale/chore/log-level chore: make log level configurable via env variable	2024-04-02 00:52:06 -04:00
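
Note on commit c47869c606 (Downloader: Fix fallback mechanisms): the message describes switching to "None-ish coalescing", where any value Python treats as empty falls through to the fallback. The snippet below is only a sketch of that idiom, not the downloader's actual code; the helper names `coalesce` and `coalesce_none_only` are hypothetical and exist purely to contrast the two behaviors.

```python
def coalesce(value, fallback):
    """Return fallback for any falsy value (None, "", 0, [], {}), else value."""
    return value or fallback


def coalesce_none_only(value, fallback):
    """Return fallback only when value is exactly None (stricter handling)."""
    return fallback if value is None else value


if __name__ == "__main__":
    # Mirrors the example from the commit message: print("" or "test")
    print(coalesce("", "test"))
    # The stricter variant keeps the empty string instead of falling back.
    print(repr(coalesce_none_only("", "test")))
```

The trade-off is that `or` also replaces legitimate falsy values such as `0` or an empty list, so it suits cases where "empty" and "missing" should be treated the same.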

312 commits