Commit graph

1051 commits

Author SHA1 Message Date
kingbri
69ac0eb8aa Model: Add vision loading support
Adds the ability to load vision parts of text + image models. Requires
an explicit flag in config because there isn't a way to automatically
determine whether the vision tower should be used.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-11-11 12:10:11 -05:00
kingbri
cc2516790d Model: Add support for chat_template.json
HuggingFace separated the chat template in the newest transformers
versions.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-11-11 12:10:06 -05:00
kingbri
9530f8c8c7 Model: Add support for chat_template.json
HuggingFace separated the chat template in the newest transformers
versions.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-11-11 12:09:27 -05:00
AlpinDale
c9ff8ef2c2 upgrade to v0.2 2024-11-04 13:28:04 +00:00
AlpinDale
1c9bc2d1af feat: add serviceinfo URI 2024-11-04 12:35:08 +00:00
Brian Dashore
b8700fbbc3
Merge pull request #230 from DocShotgun/main
Remove override_base_seq_len
2024-11-02 12:24:18 -04:00
DocShotgun
603760cecb Model: Remove override_base_seq_len 2024-10-30 10:03:08 +08:00
TerminalMan
7d18d2e2ca
Refactor the sampling class (#199)
* improve validation

* remove to_gen_params functions

* update changes for all endpoint types

* OAI: Fix calls to generation

Chat completion and completion need to have prompt split out before
pushing to the backend.

Signed-off-by: kingbri <bdashore3@proton.me>

* Sampling: Convert Top-K values of -1 to 0

Some OAI implementations use -1 as disabled instead of 0. Therefore,
add a coalesce case.

Signed-off-by: kingbri <bdashore3@proton.me>

* Sampling: Format and space out

Make the code more readable.

Signed-off-by: kingbri <bdashore3@proton.me>

* Sampling: Fix mirostat

Field items are nested in data within a Pydantic FieldInfo

Signed-off-by: kingbri <bdashore3@proton.me>

* Sampling: Format

Signed-off-by: kingbri <bdashore3@proton.me>

* Sampling: Fix banned_tokens and allowed_tokens conversion

If the provided string has whitespace, trim it before splitting.

Signed-off-by: kingbri <bdashore3@proton.me>

* Sampling: Add helpful log to dry_sequence_breakers

Let the user know if the sequence errors out.

Signed-off-by: kingbri <bdashore3@proton.me>

* Sampling: Apply validators in right order

Validators need to be applied in order from top to bottom, which is why
the after validator was not being applied properly.

Set the model to validate default params for sampler override purposes.
This can be turned off if there are unclear errors.

Signed-off-by: kingbri <bdashore3@proton.me>

* Endpoints: Format

Cleanup and semantically fix field validators

Signed-off-by: kingbri <bdashore3@proton.me>

* Kobold: Update validators and fix parameter application

Validators on parent fields cannot see child fields. Therefore,
validate using the child fields instead and alter the parent field
data from there.

Also fix badwordsids casting.

Signed-off-by: kingbri <bdashore3@proton.me>

* Sampling: Remove validate defaults and fix mirostat

If a user sets an override to a non-default value, that's their
own fault.

Run validator on the actual mirostat_mode parameter rather than
the alternate mirostat parameter.

Signed-off-by: kingbri <bdashore3@proton.me>

* Kobold: Rework badwordsids

Currently, this serves to ban the EOS token. All other functionality
was legacy, so remove it.

Signed-off-by: kingbri <bdashore3@proton.me>

* Model: Remove HuggingfaceConfig

This was only necessary for badwordsids. All other fields are handled
by exl2. Keep the class as a stub if it's needed again.

Signed-off-by: kingbri <bdashore3@proton.me>

* Kobold: Bump kcpp impersonation

TabbyAPI supports XTC now.

Signed-off-by: kingbri <bdashore3@proton.me>

* Sampling: Change alias to validation_alias

Reduces the probability for errors and makes the class consistent.

Signed-off-by: kingbri <bdashore3@proton.me>

* OAI: Use constraints for validation

Instead of adding a model_validator, use greater than or equal to
constraints provided by Pydantic.

Signed-off-by: kingbri <bdashore3@proton.me>

* Tree: Lint

Signed-off-by: kingbri <bdashore3@proton.me>

---------

Co-authored-by: SecretiveShell <84923604+SecretiveShell@users.noreply.github.com>
Co-authored-by: kingbri <bdashore3@proton.me>
2024-10-27 11:43:41 -04:00
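Two of the sampling fixes in the squashed commit above are easy to illustrate: coalescing a Top-K of -1 to 0, and trimming whitespace before splitting a token list string. The following is a minimal sketch in plain Python, not the project's code. The function names and the comma-separated input format are assumptions made for illustration.

```python
from typing import List, Union


def coalesce_top_k(top_k: int) -> int:
    """Treat -1 (used by some OAI-style clients for "disabled") as 0."""
    return 0 if top_k == -1 else top_k


def parse_token_ids(value: Union[str, List[int]]) -> List[int]:
    """Convert a comma-separated string of token IDs into a list of ints.

    Assumption: IDs arrive either as a list already or as a string such
    as " 1, 2,3 ". Whitespace is trimmed before splitting.
    """
    if isinstance(value, list):
        return value
    stripped = value.strip()
    if not stripped:
        return []
    # Skip empty fragments so a stray comma does not break int().
    return [int(part) for part in stripped.split(",") if part.strip()]
```

With these helpers, coalesce_top_k(-1) yields 0 and parse_token_ids(" 1, 2,3 ") yields [1, 2, 3].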
Brian Dashore
6e48bb420a
Model: Fix inline loading and draft key (#225)
* Model: Fix inline loading and draft key

There was a lack of foresight between the new config.yml and how
it was structured. The "draft" key became "draft_model" without updating
both the API request and inline loading keys.

For the API requests, still support "draft" as legacy, but the "draft_model"
key is preferred.

Signed-off-by: kingbri <bdashore3@proton.me>

* OAI: Add draft model dir to inline load

Was not pushed before and caused errors of the kwargs being None.

Signed-off-by: kingbri <bdashore3@proton.me>

* Model: Fix draft args application

Draft model args weren't applying since there was a reset due to how
the old override behavior worked.

Signed-off-by: kingbri <bdashore3@proton.me>

* OAI: Change embedding model load params

Use embedding_model_name to be in line with the config.

Signed-off-by: kingbri <bdashore3@proton.me>

* API: Fix parameter for draft model load

Alias name to draft_model_name.

Signed-off-by: kingbri <bdashore3@proton.me>

* API: Fix parameter for template switch

Add prompt_template_name to be more descriptive.

Signed-off-by: kingbri <bdashore3@proton.me>

* API: Fix parameter for model load

Alias name to model_name for config parity.

Signed-off-by: kingbri <bdashore3@proton.me>

* API: Add alias documentation

Signed-off-by: kingbri <bdashore3@proton.me>

---------

Signed-off-by: kingbri <bdashore3@proton.me>
2024-10-24 23:35:05 -04:00
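The legacy key handling described in the commit above (accept "draft", prefer "draft_model") could look roughly like the sketch below. It works on a plain dict and is only an illustration. The helper name is hypothetical, and the rule that "draft_model" wins when both keys are present is an assumption based on the message calling it the preferred key.

```python
from typing import Any, Dict


def normalize_draft_key(request: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the request that only uses "draft_model".

    Assumption: if both keys are present, "draft_model" wins and the
    legacy "draft" value is dropped.
    """
    normalized = dict(request)  # shallow copy, the input is not mutated
    legacy = normalized.pop("draft", None)
    if "draft_model" not in normalized and legacy is not None:
        normalized["draft_model"] = legacy
    return normalized
```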
kingbri
f20857cb34 Model: Fix override application
None values weren't being excluded on initial load when dumping.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-30 00:41:23 -04:00
kingbri
126a44483c Tree: Remove fasttensors
Now a noop in upstream.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-30 00:18:47 -04:00
kingbri
6726014d35 Dependencies: Update ExllamaV2
v0.2.3

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-30 00:17:12 -04:00
kingbri
56ce82ef77 Sampling: Add XTC support
Matches with upstream.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-24 18:10:52 -04:00
TerminalMan
f4791e7ed9
Cleanup config file loader (#208)
* fix config file loader

* prune nonetype values from config dict

fixes default values not initialising properly

* Utils: Shrink None removal function

It is more concise to use a list and dict collection if necessary
rather than iterating through and checking each value. Tested and
works with Tabby's cases.

Signed-off-by: kingbri <bdashore3@proton.me>

---------

Signed-off-by: kingbri <bdashore3@proton.me>
Co-authored-by: kingbri <bdashore3@proton.me>
2024-09-23 21:42:01 -04:00
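As an illustration of the None pruning mentioned in the commit above, here is a minimal recursive sketch built on dict and list comprehensions. It is not the project's implementation. The function name and the choice to recurse into nested lists are assumptions.

```python
from typing import Any


def prune_none(value: Any) -> Any:
    """Recursively drop None values from nested dicts and lists."""
    if isinstance(value, dict):
        return {k: prune_none(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [prune_none(v) for v in value if v is not None]
    # Scalars (and any other type) are returned unchanged.
    return value
```

Removing the None entries before validation lets a config model fall back to its declared defaults instead of receiving an explicit None.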
kingbri
fb903ecddf OAI: Relax role requirement for chat completion message lists
Make it so any message role can be parsed from a list. Not really
sure why this is the case because system and assistant shouldn't be
sending data other than text, but it also doesn't make much sense
to be extremely strict with roles either.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-22 22:42:06 -04:00
TerminalMan
2cda890deb
Add health check monitoring for EXL2 errors (#206)
* Add health check monitoring for EXL2 errors

* Health: Format and change status code

A status code of 503 makes more sense to use.
---------
2024-09-22 21:40:36 -04:00
kingbri
e0ffa90865 Dependencies: Change handling of exllamav2 checks
ExllamaV2 should check for solely exllamav2, otherwise errors don't
make sense. Migrate the combined "exl2" computed property to "inference"
since those are the required dependencies for minimal inference.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-22 12:57:28 -04:00
kingbri
5380b3fe5e Tree: Format
Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-21 14:37:01 -04:00
kingbri
3c8384ee71 Start: Fix startup with new argparser
Since the full argparser requires pydantic, gate it until all dependencies
are installed.

Also if the venv is deleted, assume that start_options.json is invalid
as well.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-21 14:36:21 -04:00
kingbri
16abaf0922 Tree: Format
Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-19 23:45:28 -04:00
kingbri
d5e4285346 Signals: Split signal handler between sync and async
Asyncio requires a closure of the event loop while sync can use SystemExit
to kill the program.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-19 23:31:29 -04:00
kingbri
b4cda78bcc Dependencies: Update Ruff
v0.6.5

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-19 22:39:08 -04:00
kingbri
c616b3b1ee Dependencies: Update PyTorch
v2.4.1 and update all associated wheels to use their 2.4 versions.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-19 22:32:23 -04:00
TerminalMan
3aeddc5255
fix issues with optional dependencies (#204)
* fix issues with optional dependencies

* format document

* Tree: Format and comment
2024-09-19 22:24:55 -04:00
kingbri
75af974c88 Model: Raise an error if the context length is too large
The dynamic generator gave a not-so-helpful exception already which
basically said to not exceed the max sequence length. Instead of
possible undefined behavior, error out.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-19 22:05:56 -04:00
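The check described in the commit above amounts to failing early with a clear message. A minimal sketch, assuming the lengths are available as token counts (the function and parameter names are hypothetical):

```python
def check_context_length(context_len: int, max_seq_len: int) -> None:
    """Raise instead of letting an oversized prompt reach the generator."""
    if context_len > max_seq_len:
        raise ValueError(
            f"Context length {context_len} is greater than the maximum "
            f"sequence length {max_seq_len}."
        )
```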
kingbri
b30336c75b Tree: Format
Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-18 21:42:01 -04:00
kingbri
edf3a00310 Config: Make API server literals case insensitive
There's no native way to handle case insensitivity in pydantic, so
add a validator which converts the API server input to be lowercase.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-18 21:39:18 -04:00
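The commit above does the lowercasing inside a Pydantic validator. The sketch below shows only the normalization step in plain Python, without Pydantic, so it is an approximation rather than the project's code. The function name and the set of allowed values are assumptions.

```python
from typing import Iterable, List

# Assumed set of accepted values, stored in lowercase.
ALLOWED_API_SERVERS = {"oai", "kobold"}


def normalize_api_servers(servers: Iterable[str]) -> List[str]:
    """Lowercase each entry, then check it against the allowed literals."""
    normalized = [server.lower() for server in servers]
    unknown = [s for s in normalized if s not in ALLOWED_API_SERVERS]
    if unknown:
        raise ValueError(f"Unknown API server(s): {unknown}")
    return normalized
```

Lowercasing before the comparison means "OAI", "oai", and "Oai" are all accepted as the same value.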
kingbri
2fd02cf4fc Startup actions: Add openapi var check
This is required to exit once the openapi spec is created.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-18 21:08:45 -04:00
kingbri
ac4b3100d0 Actions: Fix pages build
The args openapi export does not work, so use environment vars for
the time being.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-18 21:02:56 -04:00
Brian Dashore
03189bcb6f
Merge pull request #189 from SecretiveShell/pydantic-config
Update the config system to use Pydantic internally, bridging the gap between the YAML and args. YAML is still the preferred method to configure TabbyAPI, but args are no longer separately maintained.
2024-09-18 20:41:41 -04:00
kingbri
4cf85514f7 Tree: Format
Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-18 20:36:17 -04:00
kingbri
24ea85b3c5 Tree: Use safe loader for YAML
Loaders that read use a safe type while loaders that write use both
round-trip and safe options.

Also don't create module-level parsers where they're not needed.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-18 19:26:51 -04:00
TerminalMan
6c7542de9f migrate all yaml loaders to ruamel.yaml 2024-09-18 11:33:15 +01:00
kingbri
63634beb5e Config: Clarify Rope alpha options
Leaving blank will use the model's set value or auto-calculate.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-17 23:03:28 -04:00
kingbri
754fb15f23 Config: Fix draft model migration and loading
The loader takes in the "draft" parameter, so map the config model
to that when creating kwargs for initial load.

Also map the old "draft" key to the new "draft_model" key.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-17 22:48:56 -04:00
kingbri
a34bd9a684 Config: Alter YAML generation script for formatting adherence
Properly add comments and newlines where they need to go.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-17 22:44:42 -04:00
TerminalMan
948fcb7f5b migrate to ruamel.yaml 2024-09-18 01:06:34 +01:00
TerminalMan
bb4dd7200e fix defaults for api_servers 2024-09-17 15:41:32 +01:00
kingbri
daa57ceada API: Upgrade config declarations
Some were using the old unwrap methods.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-17 00:42:39 -04:00
kingbri
7fe0dbd62f Tree: Update config_sample
Uses the new YAML generator.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-16 23:32:54 -04:00
kingbri
63f8c46a92 Config: Make a better description for lora config
This is not ideal because users may still have trouble understanding
what a lora includes, but adding an example comment will help instead
of leaving a blank line.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-16 23:29:39 -04:00
kingbri
852ea8faaa Config: Don't load from file if actions present
Loading from file adds extra overhead for actions that don't rely
on file loading.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-16 23:29:07 -04:00
kingbri
ececce172e Config: Fix addition of preamble
Remove the extraneous newlines from the beginning of the preamble.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-16 23:06:01 -04:00
kingbri
f6fb60a6ed Config: Inline model loading is False
This is not a True default.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-16 22:54:35 -04:00
kingbri
8e6b8bd842 Update .gitignore
Ignore all "backup" files

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-16 22:48:13 -04:00
kingbri
26ad0ef744 API: Fix model info reporting
A deprecated preferences global var was being referenced.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-16 22:42:59 -04:00
kingbri
06a798d968 Main: Remove debug print statement for config object
Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-16 22:24:42 -04:00
kingbri
46f9fff210 Config: Move config file generation to tabby_config
Keep the models as a separate reference file without any extra
functions.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-16 22:22:24 -04:00
kingbri
d2d07ed92d Config: Update auto-migration flow
- Let the user know that migration is going to be attempted
- Have a more informative error message if auto-migration fails
- Revert back to the old config file on failure
- Don't load with a partially parsed config

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-16 18:15:50 -04:00
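The migration flow listed in the commit above (announce, attempt, report, revert) can be sketched as follows. This is an illustration under stated assumptions, not the project's code: the function name, the ".bak" suffix, and the migrate callable (which maps old file text to new file text and raises on failure) are all hypothetical.

```python
import shutil
from pathlib import Path
from typing import Callable


def migrate_with_rollback(config_path: Path, migrate: Callable[[str], str]) -> bool:
    """Attempt a config migration, restoring the original file on failure."""
    backup_path = config_path.with_name(config_path.name + ".bak")

    # Let the user know that a migration is about to be attempted.
    print(f"Attempting to migrate {config_path.name}...")
    shutil.copy2(config_path, backup_path)

    try:
        config_path.write_text(migrate(config_path.read_text()))
    except Exception as exc:
        # Put the old file back so nothing loads a partially migrated config.
        shutil.copy2(backup_path, config_path)
        print(f"Auto-migration failed ({exc}). The old config was restored.")
        return False
    return True
```

The boolean return value lets the caller decide whether to continue loading or stop, which matches the point about not loading a partially parsed config.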
kingbri
ebe7f3567e Config: Alter migration error handling and cleanup
Roll back to the old config if auto-migration fails.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-16 18:02:18 -04:00