Commit graph

275 commits

Author SHA1 Message Date
kingbri
0fadb1e5e8 Merge branch 'main' into vision 2024-11-19 21:19:21 -05:00
DocShotgun
c42655336b Config: Add option to disable fetching content from URLs 2024-11-17 23:05:17 -08:00
DocShotgun
dd41eec8a4 OAI: Initial vision support in OAI chat completions
* Support image_url inputs containing URLs or base64 strings following OAI vision spec
* Use async lru cache for image embeddings
* Add generic wrapper class for multimodal embeddings
2024-11-17 21:23:09 -08:00
kingbri
bd9e78e19e API: Add inline exception for dummy models
If an API key sends a dummy model, it shouldn't error as the server
is catering to clients that expect specific OAI model names. This
is a problem with inline model loading since these names would error
by default. Therefore, add an exception if the provided name is in the
dummy model names (which also doubles as inline strict exceptions).

However, the dummy model names weren't configurable, so add a new
option to specify exception names, otherwise the default is gpt-3.5-turbo.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-11-17 21:15:45 -05:00
kingbri
69ac0eb8aa Model: Add vision loading support
Adds the ability to load vision parts of text + image models. Requires
an explicit flag in config because there isn't a way to automatically
determine whether the vision tower should be used.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-11-11 12:10:11 -05:00
DocShotgun
603760cecb Model: Remove override_base_seq_len 2024-10-30 10:03:08 +08:00
TerminalMan
7d18d2e2ca
Refactor the sampling class (#199)
* improve validation

* remove to_gen_params functions

* update changes for all endpoint types

* OAI: Fix calls to generation

Chat completion and completion need to have prompt split out before
pushing to the backend.

Signed-off-by: kingbri <bdashore3@proton.me>

* Sampling: Convert Top-K values of -1 to 0

Some OAI implementations use -1 as disabled instead of 0. Therefore,
add a coalesce case.

Signed-off-by: kingbri <bdashore3@proton.me>

* Sampling: Format and space out

Make the code more readable.

Signed-off-by: kingbri <bdashore3@proton.me>

* Sampling: Fix mirostat

Field items are nested in data within a Pydantic FieldInfo

Signed-off-by: kingbri <bdashore3@proton.me>

* Sampling: Format

Signed-off-by: kingbri <bdashore3@proton.me>

* Sampling: Fix banned_tokens and allowed_tokens conversion

If the provided string has whitespace, trim it before splitting.

Signed-off-by: kingbri <bdashore3@proton.me>

* Sampling: Add helpful log to dry_sequence_breakers

Let the user know if the sequence errors out.

Signed-off-by: kingbri <bdashore3@proton.me>

* Sampling: Apply validators in right order

Validators need to be applied in order from top to bottom, this is why
the after validator was not being applied properly.

Set the model to validate default params for sampler override purposes.
This can be turned off if there are unclear errors.

Signed-off-by: kingbri <bdashore3@proton.me>

* Endpoints: Format

Cleanup and semantically fix field validators

Signed-off-by: kingbri <bdashore3@proton.me>

* Kobold: Update validators and fix parameter application

Validators on parent fields cannot see child fields. Therefore,
validate using the child fields instead and alter the parent field
data from there.

Also fix badwordsids casting.

Signed-off-by: kingbri <bdashore3@proton.me>

* Sampling: Remove validate defaults and fix mirostat

If a user sets an override to a non-default value, that's their
own fault.

Run validator on the actual mirostat_mode parameter rather than
the alternate mirostat parameter.

Signed-off-by: kingbri <bdashore3@proton.me>

* Kobold: Rework badwordsids

Currently, this serves to ban the EOS token. All other functionality
was legacy, so remove it.

Signed-off-by: kingbri <bdashore3@proton.me>

* Model: Remove HuggingfaceConfig

This was only necessary for badwordsids. All other fields are handled
by exl2. Keep the class as a stub if it's needed again.

Signed-off-by: kingbri <bdashore3@proton.me>

* Kobold: Bump kcpp impersonation

TabbyAPI supports XTC now.

Signed-off-by: kingbri <bdashore3@proton.me>

* Sampling: Change alias to validation_alias

Reduces the probability for errors and makes the class consistent.

Signed-off-by: kingbri <bdashore3@proton.me>

* OAI: Use constraints for validation

Instead of adding a model_validator, use greater than or equal to
constraints provided by Pydantic.

Signed-off-by: kingbri <bdashore3@proton.me>

* Tree: Lint

Signed-off-by: kingbri <bdashore3@proton.me>

---------

Co-authored-by: SecretiveShell <84923604+SecretiveShell@users.noreply.github.com>
Co-authored-by: kingbri <bdashore3@proton.me>
2024-10-27 11:43:41 -04:00
kingbri
126a44483c Tree: Remove fasttensors
Now a noop in upstream.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-30 00:18:47 -04:00
kingbri
56ce82ef77 Sampling: Add XTC support
Matches with upstream.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-24 18:10:52 -04:00
TerminalMan
f4791e7ed9
Cleanup config file loader (#208)
* fix config file loader

* prune nonetype values from config dict

fixes default values not initialising properly

* Utils: Shrink None removal function

It is more concise to use a list and dict collection if necessary
rather than iterating through and checking each value. Tested and
works with Tabby's cases.

Signed-off-by: kingbri <bdashore3@proton.me>

---------

Signed-off-by: kingbri <bdashore3@proton.me>
Co-authored-by: kingbri <bdashore3@proton.me>
2024-09-23 21:42:01 -04:00
TerminalMan
2cda890deb
Add health check monitoring for EXL2 errors (#206)
* Add health check monitoring for EXL2 errors

* Health: Format and change status code

A status code of 503 makes more sense to use.
---------
2024-09-22 21:40:36 -04:00
kingbri
e0ffa90865 Dependencies: Change handling of exllamav2 checks
ExllamaV2 should check for solely exllamav2, otherwise errors don't
make sense. Migrate the combined "exl2" computed property to "inference"
since those are the required dependencies for minimal inference.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-22 12:57:28 -04:00
kingbri
3c8384ee71 Start: Fix startup with new argparser
Since the full argparser requires pydantic, gate it until all dependencies
are installed.

Also if the venv is deleted, assume that start_options.json is invalid
as well.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-21 14:36:21 -04:00
kingbri
d5e4285346 Signals: Split signal handler between sync and async
Asyncio requires a closure of the event loop while sync can use SystemExit
to kill the program.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-19 23:31:29 -04:00
TerminalMan
3aeddc5255
fix issues with optional dependencies (#204)
* fix issues with optional dependencies

* format document

* Tree: Format and comment
2024-09-19 22:24:55 -04:00
kingbri
b30336c75b Tree: Format
Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-18 21:42:01 -04:00
kingbri
edf3a00310 Config: Make API server literals case insensitive
There's no native way to handle case insensitivity in pydantic, so
add a validator which converts the API server input to be lowercase.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-18 21:39:18 -04:00
kingbri
2fd02cf4fc Startup actions: Add openapi var check
This is required to exit once the openapi spec is created.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-18 21:08:45 -04:00
kingbri
4cf85514f7 Tree: Format
Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-18 20:36:17 -04:00
kingbri
24ea85b3c5 Tree: Use safe loader for YAML
Loaders that read use a safe type while loaders that write use both
round-trip and safe options.

Also don't create module-level parsers where they're not needed.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-18 19:26:51 -04:00
TerminalMan
6c7542de9f migrate all yaml loaders to ruamel.yaml 2024-09-18 11:33:15 +01:00
kingbri
63634beb5e Config: Clarify Rope alpha options
Leaving blank will use the model's set value or auto-calculate.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-17 23:03:28 -04:00
kingbri
754fb15f23 Config: Fix draft model migration and loading
The loader takes in the "draft" parameter, so map the config model
to that when creating kwargs for initial load.

Also map the old "draft" key to the new "draft_model" key.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-17 22:48:56 -04:00
kingbri
a34bd9a684 Config: Alter YAML generation script for formatting adherence
Properly add comments and newlines where they need to go.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-17 22:44:42 -04:00
TerminalMan
948fcb7f5b migrate to ruamel.yaml 2024-09-18 01:06:34 +01:00
TerminalMan
bb4dd7200e fix defaults for api_servers 2024-09-17 15:41:32 +01:00
kingbri
63f8c46a92 Config: Make a better description for lora config
This is not ideal because users may still have trouble understanding
what a lora includes, but adding an example comment will help instead
of leaving a blank line.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-16 23:29:39 -04:00
kingbri
852ea8faaa Config: Don't load from file if actions present
Loading from file adds extra overhead for actions that don't rely
on file loading.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-16 23:29:07 -04:00
kingbri
ececce172e Config: Fix addition of preamble
Remove the extraneous newlines from the beginning of the preamble.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-16 23:06:01 -04:00
kingbri
f6fb60a6ed Config: Inline model loading is False
This is not a True default.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-16 22:54:35 -04:00
kingbri
46f9fff210 Config: Move config file generation to tabby_config
Keep the models as a separate reference file without any extra
functions.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-16 22:22:24 -04:00
kingbri
d2d07ed92d Config: Update auto-migration flow
- Let the user know that migration is going to be attempted
- Have a more informative error message if auto-migration fails
- Revert back to the old config file on failure
- Don't load with a partially parsed config

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-16 18:15:50 -04:00
kingbri
ebe7f3567e Config: Alter migration error handling and cleanup
Rollback to the old config if automigration fails.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-16 18:02:18 -04:00
kingbri
e60c4ba5bc Config: Fix existing value check
If a sub-field exists in the model provided to the file generator,
use it. Otherwise always fallback to the default factory. This prevents
any subsequent errors from setting None.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-16 17:51:40 -04:00
kingbri
c715094cdc Config: Add logging config to migration checks
These keys were changed as well to include a "log_" prefix like the
CLI arguments.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-16 12:35:33 -04:00
kingbri
81ae461eb8 Config: Allow existing values to get included in generated file
Allows for generation from an existing config file. Primarily used
for migration purposes.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-16 12:19:58 -04:00
TerminalMan
7f03003437 rephrase info message 2024-09-16 14:18:54 +01:00
TerminalMan
564bdcf0a8 add legacy config converter 2024-09-16 14:12:47 +01:00
kingbri
b6dd21f737 Config: Handle default factories in config generation
Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-16 00:55:46 -04:00
kingbri
3340c3bf2f Config: Rewrite descriptions
This makes both config.yml and args more descriptive than before.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-16 00:55:14 -04:00
kingbri
4c8bb42ec1 Config: Reorder models
It makes sense for the LLM model groups to be clustered around
each other with the least used groups towards the bottom.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-16 00:55:14 -04:00
kingbri
8ff9f2c6c0 Config: Rewrite docstrings for models
Adheres to the old config.yml's descriptions and allows for newlines
in generated YAML.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-16 00:55:14 -04:00
kingbri
250d76f5c6 Config: Alter YAML generator function
These changes fix the amount and order of newlines to look pleasing
for the user. However, the changes used in here are kind of hacky
and need a proper fix that can contain the same level of efficiency.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-16 00:55:11 -04:00
TerminalMan
92af656705 improve config generation action 2024-09-15 17:50:37 +01:00
kingbri
5bfa952671 Actions: Format
Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-14 22:05:11 -04:00
kingbri
d013729b7d Config: Add aliases for logging config
Config.yml and args take in two different values.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-14 21:56:16 -04:00
kingbri
6f28cfe905 Logging: Remove preferences global
This is no longer needed because config is a singleton.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-14 21:49:44 -04:00
kingbri
a09dd802c2 Config: Cleanup and organize functions
Remove access of private attributes and use safer functions. Also
move generalized functions into utils files.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-14 21:48:39 -04:00
TerminalMan
0903f852db add export openAPI to config 2024-09-15 00:17:36 +01:00
TerminalMan
533e7c9119 remove unnecessary code 2024-09-14 22:49:37 +01:00