46 commits

Author SHA1 Message Date
kingbri
5fb9cdc2b1 Dependencies: Add Python 3.12 specific dependencies
Install a prebuilt fastparquet wheel for Windows and add setuptools
since torch may require it for some reason.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-03 17:43:14 -04:00
kingbri
e66d213aef Revert "Dependencies: Use hosted pip index instead of Github"
This reverts commit f111052e39.

This was a bad idea since the netlify server has limited bandwidth.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-03 11:35:26 -04:00
kingbri
b124797949 Dependencies: Re-add sentence-transformers
This is actually required for infinity to load a model.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-02 14:35:58 -04:00
kingbri
56619810bf Dependencies: Switch sentence-transformers to infinity-emb
Leftover before the transition.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-02 13:34:47 -04:00
Brian Dashore
1bf062559d Merge pull request #158 from AlpinDale/embeddings
feat: add embeddings support via Infinity-emb
2024-07-31 20:33:12 -04:00
kingbri
f111052e39 Dependencies: Use hosted pip index instead of Github
Installing directly from github causes pip's HTTP cache to not
recognize that the correct version of a package is already installed.
This causes a redownload.

When using the Start.bat script, it updates dependencies automatically
to keep users on the latest versions of a package for security reasons.

A simple pip cache website helps alleviate this problem and allows pip
to find the cached wheels when invoked with an upgrade argument.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-30 20:46:37 -04:00
kingbri
d85414738d Dependencies: Update Flash Attention 2
v2.6.3 with torch 2.3 wheels.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-28 13:50:15 -04:00
kingbri
c79e0832d5 Revert "Dependencies: Update pytorch and flash_attention"
This reverts commit f47d96790c.

See https://github.com/pytorch/pytorch/issues/131662 for more information.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-28 13:49:04 -04:00
kingbri
f47d96790c Dependencies: Update pytorch and flash_attention
v2.4.0 and v2.6.3

Also use torch 2.4 wheels.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-25 23:39:52 -04:00
AlpinDale
f20cd330ef feat: add embeddings support via sentence-transformers
2024-07-26 02:45:07 +00:00
kingbri
a1c3f6cc1c Dependencies: Update ExllamaV2
v0.1.8

Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-24 22:00:43 -04:00
kingbri
27f9559d83 Dependencies: Switch to fastapi-slim
Reduces dependency size since the full fastapi package isn't required.
Add httptools since it makes requests faster and it was installed
with fastapi previously.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-24 21:59:56 -04:00
kingbri
42bc4adcfb Config: Add option to set priority to realtime
Realtime process priority directs system resources toward tabby's
processes. Running as administrator gives realtime priority, while
running as a normal user sets high priority instead.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-24 21:50:06 -04:00
kingbri
5c082b7e8c Async: Add option to use Uvloop/Winloop
These are faster event loops for asyncio which should improve overall
performance. Gate these under an experimental flag for now to stress
test these loops.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-24 18:59:20 -04:00
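
The idea in commit 5c082b7e8c can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the `experimental` flag and the function names `pick_event_loop_module` and `run` are hypothetical, and it assumes that winloop exposes the same `new_event_loop()` entry point as uvloop. If the faster loop is not installed, it falls back to the standard asyncio loop.

```python
import asyncio
import importlib
import sys


def pick_event_loop_module(experimental: bool):
    """Return uvloop/winloop if the (hypothetical) flag is on and the
    package is importable, otherwise None."""
    if not experimental:
        return None
    # winloop targets Windows; uvloop targets other platforms.
    name = "winloop" if sys.platform == "win32" else "uvloop"
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


def run(main_coro_fn, experimental: bool = False):
    """Run a coroutine function on the selected event loop."""
    module = pick_event_loop_module(experimental)
    if module is not None:
        # Assumption: both packages provide new_event_loop().
        loop = module.new_event_loop()
    else:
        loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(main_coro_fn())
    finally:
        loop.close()
```

Keeping the selection behind a flag, as the commit message describes, means a missing or misbehaving loop package never prevents startup.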
kingbri
5917515696 Dependencies: Update flash-attention
v2.6.1

Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-12 10:09:49 -04:00
kingbri
073e9fa6f0 Dependencies: Bump ExllamaV2
v0.1.7

Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-11 14:22:50 -04:00
kingbri
e58e197f0b Ruff: Remove deprecated rule E999
The syntax error rule is removed since syntax errors are always shown
when linting anyway.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-08 12:36:15 -04:00
kingbri
c5ea2abe24 Dependencies: Update ExllamaV2
v0.1.6

Signed-off-by: kingbri <bdashore3@proton.me>
2024-06-23 21:45:04 -04:00
kingbri
d85b526644 Dependencies: Pin numpy
v2.x breaks many upstream dependencies (torch). Pin until repos are
fixed.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-06-23 21:40:09 -04:00
DocShotgun
107436f601 Dependencies: Fix AMD triton (#139)
2024-06-18 15:19:27 +02:00
DocShotgun
55d979b7a5 Update dependencies, support Python 3.12, update for exl2 0.1.5 (#134)
* Dependencies: Add wheels for Python 3.12

* Model: Switch fp8 cache to Q8 cache

* Model: Add ability to set draft model cache mode

* Dependencies: Bump exllamav2 to 0.1.5

* Model: Support Q6 cache

* Config: Add Q6 cache and draft_cache_mode to config sample
2024-06-09 17:27:39 +02:00
turboderp
e889fa3efe Bump exllamav2 to v0.1.4 (#128)
2024-06-04 02:32:08 +02:00
kingbri
8d31a5aed1 Dependencies: Update Flash Attention 2
v2.5.9.post1

Signed-off-by: kingbri <bdashore3@proton.me>
2024-05-28 00:45:35 -04:00
kingbri
19961f4126 Dependencies: Update ExllamaV2
v0.1.1

Signed-off-by: kingbri <bdashore3@proton.me>
2024-05-27 13:38:07 -04:00
kingbri
47582c2440 Dependencies: Update ExllamaV2
v0.1.0

Signed-off-by: kingbri <bdashore3@proton.me>
2024-05-25 21:16:14 -04:00
kingbri
cd78728a77 Dependencies: Update ExllamaV2
v0.0.21

Signed-off-by: kingbri <bdashore3@proton.me>
2024-05-11 19:26:03 -04:00
Arseniy Bakharovsky
33c86be45c Update pyproject.toml
2024-05-08 03:31:15 +04:00
kingbri
55ccd1baad API: Add HuggingFace downloader
Adds an asynchronous huggingface downloader that uses HF hub to fetch
all repo files. The current HF hub package has a snapshot_download
function that does not cancel on KeyboardInterrupt.

Instead, make a downloader that uses the Rich progress bar styling
along with a cancellable interface. Finally, link this to TabbyAPI.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-04-29 01:15:02 -04:00
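
As a rough illustration of the cancellable-download idea in commit 55ccd1baad, the sketch below uses only the standard library. It is not the project's downloader: `fetch_file` is a hypothetical stand-in that sleeps instead of performing an HTTP request, and `download_all` is an illustrative name. The point is that each file is its own asyncio task, so cancelling the caller (for example on KeyboardInterrupt) stops every transfer still in flight.

```python
import asyncio


async def fetch_file(name: str, delay: float) -> str:
    # Hypothetical stand-in for one file download; a real version
    # would stream bytes from the hub and update a progress bar.
    await asyncio.sleep(delay)
    return name


async def download_all(files: dict[str, float]) -> list[str]:
    """Download all files concurrently and return their names in order."""
    tasks = [asyncio.create_task(fetch_file(n, d)) for n, d in files.items()]
    try:
        return await asyncio.gather(*tasks)
    finally:
        # On cancellation or failure, stop whatever is still running
        # and wait for those tasks to finish unwinding.
        for task in tasks:
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)
```

Calling `asyncio.run(download_all({"config.json": 0.1, "model.safetensors": 0.5}))` returns the file names once both simulated downloads finish; the file names here are placeholders.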
kingbri
fb01b164d8 Dependencies: Update flash attention 2
v2.5.8

Signed-off-by: kingbri <bdashore3@proton.me>
2024-04-28 11:07:00 -04:00
kingbri
0e015ad58e Dependencies: Update ExllamaV2
v0.0.20

ROCm 6.0 is now required

Signed-off-by: kingbri <bdashore3@proton.me>
2024-04-28 11:06:59 -04:00
kingbri
3de93d7c0a Dependencies: Update torch
v2.3.0

NOTE: ROCm is updated to v6.0 wheels

Signed-off-by: kingbri <bdashore3@proton.me>
2024-04-28 11:06:17 -04:00
kingbri
4daa6390a5 Dependencies: Unpin lm-format-enforcer
It should be fine to use the stable version from now on.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-04-28 11:06:17 -04:00
kingbri
1e56d43772 Dependencies: Update lm-format-enforcer
v0.9.8

Signed-off-by: kingbri <bdashore3@proton.me>
2024-04-22 21:33:28 -04:00
kingbri
933c5afef0 Dependencies: Update ExllamaV2 and lm-format-enforcer
ExllamaV2: v0.0.19
lmfe: v0.9.6

Signed-off-by: kingbri <bdashore3@proton.me>
2024-04-19 21:15:50 -04:00
kingbri
ed05f376d9 Dependencies: Switch to LM-format-enforcer fork
LM format enforcer has some latency on token ingestion, so use an
optimized fork instead. Also add this in as a base dependency since
the size is small.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-04-14 11:59:49 -04:00
kingbri
30c4554572 Requirements: Update Exllamav2
v0.0.18

Signed-off-by: kingbri <bdashore3@proton.me>
2024-04-07 18:00:56 -04:00
kingbri
f534930270 Dependencies: Bump Exllamav2
v0.0.17

Signed-off-by: kingbri <bdashore3@proton.me>
2024-03-31 23:10:28 -04:00
kingbri
05b5700334 Dependencies: Update torch
v2.2.2

Signed-off-by: kingbri <bdashore3@proton.me>
2024-03-30 17:03:37 -04:00
kingbri
5c94894a1a Dependencies: Update Flash Attention
v2.5.6

Signed-off-by: kingbri <bdashore3@proton.me>
2024-03-30 16:58:24 -04:00
kingbri
d4280e1378 Dependencies: Add pytorch-triton-rocm
Required for AMD installs.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-03-28 11:02:56 -04:00
kingbri
26496c4db2 Dependencies: Require tokenizers
This is used for some models and isn't too big in size (compared to
other huggingface dependencies), so include it by default.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-03-23 01:12:21 -04:00
kingbri
37a80334a8 Dependencies: Add packaging
This is a required dependency.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-03-21 11:27:27 -04:00
kingbri
345bcc30c7 Dependencies: Add extras feature
Installs all optional dependencies to the venv.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-03-21 00:09:38 -04:00
kingbri
7020a0a2d1 Dependencies: Update Exllamav2
v0.0.16

Signed-off-by: kingbri <bdashore3@proton.me>
2024-03-20 15:21:37 -04:00
kingbri
061e1d94c2 Ruff: Migrate to pyproject
Removes unnecessary ruff.toml.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-03-20 15:21:37 -04:00
kingbri
b1ca435695 Tree: Add pyproject.toml
This will manage dependencies from now on since it's a more flexible
file that's similar to other packaging utilities like npm and cargo.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-03-20 15:21:37 -04:00