jalr/tabbyAPI-ollama

Author	SHA1	Message	Date
kingbri	7f6294a96d	Tree: Format Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>	2025-02-13 22:42:59 -05:00
mefich	3b482d80e4	Add strftime_now to Jinja2 to use with Granite3 models Granite3 default template uses strftime_now function. Currently Jinja2 raises an exception because strftime_now is undefined and /v1/chat/completions endpoint doesn't work with these models when a template from the model metadata is used.	2025-02-13 18:08:24 +02:00
Brian	2e491472d1	Merge pull request #254 from lucyknada/main add draft_gpu_split option for spec decoding	2025-02-11 16:48:03 -05:00
kingbri	e290b88568	Args: Expose api-servers to subcommands This is required for the export-openapi action. Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>	2025-02-10 23:39:46 -05:00
kingbri	30ab8e04b9	Args: Add subcommands to run actions Migrate OpenAPI and sample config export to subcommands "export-openapi" and "export-config". Also add a "download" subcommand that passes args to the TabbyAPI downloader. This allows models to be downloaded via the API and CLI args. Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>	2025-02-10 23:14:22 -05:00
kingbri	30f02e5453	Main: Remove uvloop/winloop from experimental status Uvloop/Winloop does provide advantages to asyncio vs the standard Proactor loop, so remove experimental status. Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>	2025-02-10 21:30:48 -05:00
kingbri	beb6d8faa5	Model: Adjust draft_gpu_split and add to config The previous code overrode the existing gpu split and device idx values. This now sets an independent draft_gpu_split value and adjusts the gpu_devices check only if the draft_gpu_split array is larger than the gpu_split array. Draft gpu split is not Tensor Parallel, and defaults to gpu_split_auto if a split is not provided. Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>	2025-02-08 16:09:46 -05:00
kingbri	dcbf2de9e5	Logger: Add timestamps Was against this for a while due to the length of timestamps clogging the console, but it makes sense to know when something goes wrong. Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>	2025-02-07 18:40:28 -05:00
kingbri	54fda0dc09	Tree: Format Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>	2025-02-07 18:03:33 -05:00
kingbri	96e8375ec8	Multimodal: Fix memory leak with MMEmbeddings On a basic python class, class attributes are handled by reference, meaning that every instance of embeddings would attach to that reference and allocate more memory. Switch to a Pydantic class and factory methods when instantiating. Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>	2025-02-02 12:21:19 -05:00
kingbri	b579fd46b7	Dependencies: Remove outlines from optional check Outlines is no longer a dependency that's used in TabbyAPI. Signed-off-by: kingbri <8082010+bdashore3@users.noreply.github.com>	2024-12-18 11:56:40 -05:00
kingbri	c23e406f2d	Sampling: Add max_completion_tokens Conforms with OAI's updated spec Signed-off-by: kingbri <8082010+bdashore3@users.noreply.github.com>	2024-12-13 01:02:37 -05:00
kingbri	c49047eea1	Model: Fix load packets The model_type internal reference was changed to an enum for a more extendable loading process. Return the current model type when loading a new model. Signed-off-by: kingbri <bdashore3@proton.me>	2024-11-21 18:06:47 -05:00
kingbri	c652a6e030	API: Transform multimodal into an actual class Migrate the add method into the class itself. Also, a BaseModel isn't needed here since this isn't a serialized class. Signed-off-by: kingbri <bdashore3@proton.me>	2024-11-20 00:06:20 -05:00
kingbri	0fadb1e5e8	Merge branch 'main' into vision	2024-11-19 21:19:21 -05:00
DocShotgun	c42655336b	Config: Add option to disable fetching content from URLs	2024-11-17 23:05:17 -08:00
DocShotgun	dd41eec8a4	OAI: Initial vision support in OAI chat completions * Support image_url inputs containing URLs or base64 strings following OAI vision spec * Use async lru cache for image embeddings * Add generic wrapper class for multimodal embeddings	2024-11-17 21:23:09 -08:00
kingbri	bd9e78e19e	API: Add inline exception for dummy models If an API key sends a dummy model, it shouldn't error as the server is catering to clients that expect specific OAI model names. This is a problem with inline model loading since these names would error by default. Therefore, add an exception if the provided name is in the dummy model names (which also doubles as inline strict exceptions). However, the dummy model names weren't configurable, so add a new option to specify exception names, otherwise the default is gpt-3.5-turbo. Signed-off-by: kingbri <bdashore3@proton.me>	2024-11-17 21:15:45 -05:00
kingbri	69ac0eb8aa	Model: Add vision loading support Adds the ability to load vision parts of text + image models. Requires an explicit flag in config because there isn't a way to automatically determine whether the vision tower should be used. Signed-off-by: kingbri <bdashore3@proton.me>	2024-11-11 12:10:11 -05:00
DocShotgun	603760cecb	Model: Remove override_base_seq_len	2024-10-30 10:03:08 +08:00
TerminalMan	7d18d2e2ca	Refactor the sampling class (#199) * improve validation * remove to_gen_params functions * update changes for all endpoint types * OAI: Fix calls to generation Chat completion and completion need to have prompt split out before pushing to the backend. Signed-off-by: kingbri <bdashore3@proton.me> * Sampling: Convert Top-K values of -1 to 0 Some OAI implementations use -1 as disabled instead of 0. Therefore, add a coalesce case. Signed-off-by: kingbri <bdashore3@proton.me> * Sampling: Format and space out Make the code more readable. Signed-off-by: kingbri <bdashore3@proton.me> * Sampling: Fix mirostat Field items are nested in data within a Pydantic FieldInfo Signed-off-by: kingbri <bdashore3@proton.me> * Sampling: Format Signed-off-by: kingbri <bdashore3@proton.me> * Sampling: Fix banned_tokens and allowed_tokens conversion If the provided string has whitespace, trim it before splitting. Signed-off-by: kingbri <bdashore3@proton.me> * Sampling: Add helpful log to dry_sequence_breakers Let the user know if the sequence errors out. Signed-off-by: kingbri <bdashore3@proton.me> * Sampling: Apply validators in right order Validators need to be applied in order from top to bottom, this is why the after validator was not being applied properly. Set the model to validate default params for sampler override purposes. This can be turned off if there are unclear errors. Signed-off-by: kingbri <bdashore3@proton.me> * Endpoints: Format Cleanup and semantically fix field validators Signed-off-by: kingbri <bdashore3@proton.me> * Kobold: Update validators and fix parameter application Validators on parent fields cannot see child fields. Therefore, validate using the child fields instead and alter the parent field data from there. Also fix badwordsids casting. Signed-off-by: kingbri <bdashore3@proton.me> * Sampling: Remove validate defaults and fix mirostat If a user sets an override to a non-default value, that's their own fault. Run validator on the actual mirostat_mode parameter rather than the alternate mirostat parameter. Signed-off-by: kingbri <bdashore3@proton.me> * Kobold: Rework badwordsids Currently, this serves to ban the EOS token. All other functionality was legacy, so remove it. Signed-off-by: kingbri <bdashore3@proton.me> * Model: Remove HuggingfaceConfig This was only necessary for badwordsids. All other fields are handled by exl2. Keep the class as a stub if it's needed again. Signed-off-by: kingbri <bdashore3@proton.me> * Kobold: Bump kcpp impersonation TabbyAPI supports XTC now. Signed-off-by: kingbri <bdashore3@proton.me> * Sampling: Change alias to validation_alias Reduces the probability for errors and makes the class consistent. Signed-off-by: kingbri <bdashore3@proton.me> * OAI: Use constraints for validation Instead of adding a model_validator, use greater than or equal to constraints provided by Pydantic. Signed-off-by: kingbri <bdashore3@proton.me> * Tree: Lint Signed-off-by: kingbri <bdashore3@proton.me> --------- Co-authored-by: SecretiveShell <84923604+SecretiveShell@users.noreply.github.com> Co-authored-by: kingbri <bdashore3@proton.me>	2024-10-27 11:43:41 -04:00
kingbri	126a44483c	Tree: Remove fasttensors Now a noop in upstream. Signed-off-by: kingbri <bdashore3@proton.me>	2024-09-30 00:18:47 -04:00
kingbri	56ce82ef77	Sampling: Add XTC support Matches with upstream. Signed-off-by: kingbri <bdashore3@proton.me>	2024-09-24 18:10:52 -04:00
TerminalMan	f4791e7ed9	Cleanup config file loader (#208) * fix config file loader * prune nonetype values from config dict fixes default values not initialising properly * Utils: Shrink None removal function It is more concise to use a list and dict collection if necessary rather than iterating through and checking each value. Tested and works with Tabby's cases. Signed-off-by: kingbri <bdashore3@proton.me> --------- Signed-off-by: kingbri <bdashore3@proton.me> Co-authored-by: kingbri <bdashore3@proton.me>	2024-09-23 21:42:01 -04:00
TerminalMan	2cda890deb	Add health check monitoring for EXL2 errors (#206) * Add health check monitoring for EXL2 errors * Health: Format and change status code A status code of 503 makes more sense to use.	2024-09-22 21:40:36 -04:00
kingbri	e0ffa90865	Dependencies: Change handling of exllamav2 checks ExllamaV2 should check for solely exllamav2, otherwise errors don't make sense. Migrate the combined "exl2" computed property to "inference" since those are the required dependencies for minimal inference. Signed-off-by: kingbri <bdashore3@proton.me>	2024-09-22 12:57:28 -04:00
kingbri	3c8384ee71	Start: Fix startup with new argparser Since the full argparser requires pydantic, gate it until all dependencies are installed. Also if the venv is deleted, assume that start_options.json is invalid as well. Signed-off-by: kingbri <bdashore3@proton.me>	2024-09-21 14:36:21 -04:00
kingbri	d5e4285346	Signals: Split signal handler between sync and async Asyncio requires a closure of the event loop while sync can use SystemExit to kill the program. Signed-off-by: kingbri <bdashore3@proton.me>	2024-09-19 23:31:29 -04:00
TerminalMan	3aeddc5255	fix issues with optional dependencies (#204) * fix issues with optional dependencies * format document * Tree: Format and comment	2024-09-19 22:24:55 -04:00
kingbri	b30336c75b	Tree: Format Signed-off-by: kingbri <bdashore3@proton.me>	2024-09-18 21:42:01 -04:00
kingbri	edf3a00310	Config: Make API server literals case insensitive There's no native way to handle case insensitivity in pydantic, so add a validator which converts the API server input to be lowercase. Signed-off-by: kingbri <bdashore3@proton.me>	2024-09-18 21:39:18 -04:00
kingbri	2fd02cf4fc	Startup actions: Add openapi var check This is required to exit once the openapi spec is created. Signed-off-by: kingbri <bdashore3@proton.me>	2024-09-18 21:08:45 -04:00
kingbri	4cf85514f7	Tree: Format Signed-off-by: kingbri <bdashore3@proton.me>	2024-09-18 20:36:17 -04:00
kingbri	24ea85b3c5	Tree: Use safe loader for YAML Loaders that read use a safe type while loaders that write use both round-trip and safe options. Also don't create module-level parsers where they're not needed. Signed-off-by: kingbri <bdashore3@proton.me>	2024-09-18 19:26:51 -04:00
TerminalMan	6c7542de9f	migrate all yaml loaders to ruamel.yaml	2024-09-18 11:33:15 +01:00
kingbri	63634beb5e	Config: Clarify Rope alpha options Leaving blank will use the model's set value or auto-calculate. Signed-off-by: kingbri <bdashore3@proton.me>	2024-09-17 23:03:28 -04:00
kingbri	754fb15f23	Config: Fix draft model migration and loading The loader takes in the "draft" parameter, so map the config model to that when creating kwargs for initial load. Also map the old "draft" key to the new "draft_model" key. Signed-off-by: kingbri <bdashore3@proton.me>	2024-09-17 22:48:56 -04:00
kingbri	a34bd9a684	Config: Alter YAML generation script for formatting adherence Properly add comments and newlines where they need to go. Signed-off-by: kingbri <bdashore3@proton.me>	2024-09-17 22:44:42 -04:00
TerminalMan	948fcb7f5b	migrate to ruamel.yaml	2024-09-18 01:06:34 +01:00
TerminalMan	bb4dd7200e	fix defaults for api_servers	2024-09-17 15:41:32 +01:00
kingbri	63f8c46a92	Config: Make a better description for lora config This is not ideal because users may still have trouble understanding what a lora includes, but adding an example comment will help instead of leaving a blank line. Signed-off-by: kingbri <bdashore3@proton.me>	2024-09-16 23:29:39 -04:00
kingbri	852ea8faaa	Config: Don't load from file if actions present Loading from file adds extra overhead for actions that don't rely on file loading. Signed-off-by: kingbri <bdashore3@proton.me>	2024-09-16 23:29:07 -04:00
kingbri	ececce172e	Config: Fix addition of preamble Remove the extraneous newlines from the beginning of the preamble. Signed-off-by: kingbri <bdashore3@proton.me>	2024-09-16 23:06:01 -04:00
kingbri	f6fb60a6ed	Config: Inline model loading is False This is not a True default. Signed-off-by: kingbri <bdashore3@proton.me>	2024-09-16 22:54:35 -04:00
kingbri	46f9fff210	Config: Move config file generation to tabby_config Keep the models as a separate reference file without any extra functions. Signed-off-by: kingbri <bdashore3@proton.me>	2024-09-16 22:22:24 -04:00
kingbri	d2d07ed92d	Config: Update auto-migration flow - Let the user know that migration is going to be attempted - Have a more informative error message if auto-migration fails - Revert back to the old config file on failure - Don't load with a partially parsed config Signed-off-by: kingbri <bdashore3@proton.me>	2024-09-16 18:15:50 -04:00
kingbri	ebe7f3567e	Config: Alter migration error handling and cleanup Rollback to the old config if automigration fails. Signed-off-by: kingbri <bdashore3@proton.me>	2024-09-16 18:02:18 -04:00
kingbri	e60c4ba5bc	Config: Fix existing value check If a sub-field exists in the model provided to the file generator, use it. Otherwise always fallback to the default factory. This prevents any subsequent errors from setting None. Signed-off-by: kingbri <bdashore3@proton.me>	2024-09-16 17:51:40 -04:00
kingbri	c715094cdc	Config: Add logging config to migration checks These keys were changed as well to include a "log_" prefix like the CLI arguments. Signed-off-by: kingbri <bdashore3@proton.me>	2024-09-16 12:35:33 -04:00
kingbri	81ae461eb8	Config: Allow existing values to get included in generated file Allows for generation from an existing config file. Primarily used for migration purposes. Signed-off-by: kingbri <bdashore3@proton.me>	2024-09-16 12:19:58 -04:00
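
Note on 96e8375ec8 (Multimodal: Fix memory leak with MMEmbeddings): the commit message attributes the leak to class attributes being handled by reference. The sketch below illustrates that general Python behavior only. The class names and the `content` field are hypothetical and are not taken from the TabbyAPI source, and it uses the standard library's `dataclasses.field(default_factory=...)` as a stand-in for the Pydantic class and factory methods the commit actually switched to.

```python
from dataclasses import dataclass, field


class SharedEmbeddings:
    # Hypothetical stand-in: this list is created once, on the class,
    # so every instance appends to the same object.
    content: list = []

    def add(self, item):
        self.content.append(item)


@dataclass
class IsolatedEmbeddings:
    # default_factory builds a fresh list for each instance.
    content: list = field(default_factory=list)

    def add(self, item):
        self.content.append(item)


a, b = SharedEmbeddings(), SharedEmbeddings()
a.add("image-1")
# b never added anything, yet it sees a's item because the list is shared.
shared_view = b.content

c, d = IsolatedEmbeddings(), IsolatedEmbeddings()
c.add("image-1")
# d has its own list, so it stays empty.
isolated_view = d.content
```

With the shared attribute, data accumulates across instances for the life of the process, which is consistent with the growth in memory the commit describes.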


239 commits
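
Note on f4791e7ed9 (Cleanup config file loader): the commit prunes None values from the config dict so that defaults initialise properly. A minimal sketch of that idea, assuming a nested dict/list structure, might look like the following. The function name `prune_none` and the sample keys and values are hypothetical and do not reproduce the repository's actual helper.

```python
def prune_none(value):
    """Recursively drop None entries from dicts and lists.

    Hypothetical helper for illustration; not the repository's function.
    """
    if isinstance(value, dict):
        return {k: prune_none(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [prune_none(v) for v in value if v is not None]
    return value


# Illustrative config, as a YAML loader might return it when keys are
# present in the file but left blank.
raw = {
    "network": {"host": None, "port": 5000},
    "model": {"model_name": None},
    "api_servers": ["oai", None],
}

cleaned = prune_none(raw)
```

Once the blank keys are gone, a settings model that receives `cleaned` can fall back to its own defaults instead of being handed an explicit None.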