80 commits

Author SHA1 Message Date
kingbri
ab04a6ed60 Dependencies: Bump ExllamaV3
v0.0.5

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-07-18 22:56:35 -04:00
kingbri
bf936f5c39 Dependencies: Update exllamav2
v0.3.2

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-07-13 23:33:12 -04:00
turboderp
d357f100d0 Dependencies: Bump ExllamaV3
2025-06-15 19:12:45 +02:00
turboderp
691a080ac7 Dependencies: Bump ExllamaV3 and ExllamaV2
2025-05-31 23:55:04 +02:00
kingbri
fa534fe551 Dependencies: Update Ruff
v0.11.10

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-05-17 00:46:25 -04:00
kingbri
c9dc0b2aa4 Dependencies: Bump ExllamaV3 and ExllamaV2
v0.0.2 and v0.3.0 respectively

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-05-12 15:29:31 -04:00
kingbri
33ac016023 Dependencies: Add ExllamaV3
v0.0.1

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-05-09 21:42:07 -04:00
kingbri
2b3ed3fc79 Dependencies: Switch back to official exl2 wheels
These wheels are built properly and have the correct version and
filename.

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-04-26 21:27:28 -04:00
kingbri
eb435f79e3 Dependencies (TEMP): Use my wheels for exl2
Use these until exl2 updates its wheels to have the version equal the
filename.

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-04-26 02:11:33 -04:00
kingbri
136c8139f9 Dependencies: Update PyTorch, Exllamav2, and FA2
PyTorch: v2.7.0 on cuda 128 + ROCm 6.3
Exllamav2: v0.2.9
FA2: v2.7.4.post1 on cuda 128

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-04-24 21:52:48 -04:00
kingbri
9834c7f99b Dependencies: Ungate numpy
numpy v2 now works with Torch

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-04-21 23:14:14 -04:00
kingbri
0dcbb7a722 Dependencies: Update torch, exllamav2, and flash-attn
Torch - 2.6.0
ExllamaV2 - 0.2.8
Flash-attn - 2.7.4.post1

Cuda wheels are now 12.4 instead of 12.1, feature names need to be
migrated over.

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-02-09 01:27:48 -05:00
Jakub Filo
f8d9cfb5fd Bump formatron to 0.4.11
2025-01-08 00:48:25 +01:00
kingbri
cfb439c0e6 Dependencies: Update exllamav2 and pytorch for ROCm
Exllama v0.2.7, pytorch v2.5.1 across all cards.

AMD now requires ROCm 6.2

Signed-off-by: kingbri <8082010+bdashore3@users.noreply.github.com>
2025-01-01 16:22:10 -05:00
kingbri
fa8035ef72 Dependencies: Update sse-starlette and formatron
Also pin newer versions of dependencies and fix an import from sse-starlette

Signed-off-by: kingbri <8082010+bdashore3@users.noreply.github.com>
2024-12-21 23:14:55 -05:00
kingbri
bc3c154c96 Dependencies: Pin tokenizers
Use a version greater than 0.20.0 for newer model support.

Signed-off-by: kingbri <8082010+bdashore3@users.noreply.github.com>
2024-12-13 00:58:25 -05:00
kingbri
f25ac4b833 Dependencies: Update ExllamaV2
v0.2.6

Signed-off-by: kingbri <8082010+bdashore3@users.noreply.github.com>
2024-12-13 00:47:29 -05:00
kingbri
8ccd7a12a2 Merge branch 'main' into formatron
2024-12-05 23:01:22 -05:00
kingbri
ac85e34356 Dependencies: Update Torch, FA2, and Exl2
Torch: 2.5, FA2 2.7.0.post2, Exl2 v0.2.5

Don't update torch for rocm as exl2 isn't built for rocm 6.2

Signed-off-by: kingbri <8082010+bdashore3@users.noreply.github.com>
2024-12-03 22:57:00 -05:00
kingbri
ca86ab5477 Dependencies: Remove CUDA 11.8
Most software has moved to CUDA 12 and cards that aren't supported by
11.8 don't use tabby anyways.

Signed-off-by: kingbri <8082010+bdashore3@users.noreply.github.com>
2024-12-03 22:37:03 -05:00
kingbri
3c4211c963 Dependencies: Ensure updated kbnf
Signed-off-by: kingbri <8082010+bdashore3@users.noreply.github.com>
2024-12-02 15:10:20 -05:00
DocShotgun
0836a9317f Grammar: Initial Formatron regex and JSON schema implementation
* Replace LMFE's regex and JSON schema filters with Formatron's
* Remove Outlines EBNF filter in preparation for Formatron KBNF filter
* TODO: Implement Formatron KBNF filter
2024-11-23 10:27:37 -08:00
kingbri
9cd7fcaf99 Pyproject: Add pillow to deps
Signed-off-by: kingbri <bdashore3@proton.me>
2024-11-22 17:48:56 -05:00
kingbri
0fadb1e5e8 Merge branch 'main' into vision
2024-11-19 21:19:21 -05:00
DocShotgun
dd41eec8a4 OAI: Initial vision support in OAI chat completions
* Support image_url inputs containing URLs or base64 strings following OAI vision spec
* Use async lru cache for image embeddings
* Add generic wrapper class for multimodal embeddings
2024-11-17 21:23:09 -08:00
kingbri
69838e92ca Dependencies: Update ExllamaV2
v0.2.4

Signed-off-by: kingbri <bdashore3@proton.me>
2024-11-13 22:16:11 -05:00
kingbri
6726014d35 Dependencies: Update ExllamaV2
v0.2.3

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-30 00:17:12 -04:00
kingbri
b4cda78bcc Dependencies: Update Ruff
v0.6.5

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-19 22:39:08 -04:00
kingbri
c616b3b1ee Dependencies: Update PyTorch
v2.4.1 and update all associated wheels to use their 2.4 versions.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-19 22:32:23 -04:00
TerminalMan
948fcb7f5b migrate to ruamel.yaml
2024-09-18 01:06:34 +01:00
turboderp
318c425d84 Bump exllamav2 to 0.2.2
2024-09-14 21:43:26 +02:00
kingbri
cf97113868 Dependencies: Update Exllamav2
v0.2.1

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-08 21:12:31 -04:00
kingbri
96fce34253 Dependencies: Update ExllamaV2
v0.2.0

Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-28 18:34:00 -04:00
kingbri
565b0300d6 Dependencies: Update Exllamav2
v0.1.9

Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-22 14:15:19 -04:00
kingbri
5fb9cdc2b1 Dependencies: Add Python 3.12 specific dependencies
Install a prebuilt fastparquet wheel for Windows and add setuptools
since torch may require it for some reason.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-03 17:43:14 -04:00
kingbri
e66d213aef Revert "Dependencies: Use hosted pip index instead of Github"
This reverts commit f111052e39.

This was a bad idea since the netlify server has limited bandwidth.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-03 11:35:26 -04:00
kingbri
b124797949 Dependencies: Re-add sentence-transformers
This is actually required for infinity to load a model.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-02 14:35:58 -04:00
kingbri
56619810bf Dependencies: Switch sentence-transformers to infinity-emb
Leftover before the transition.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-02 13:34:47 -04:00
Brian Dashore
1bf062559d Merge pull request #158 from AlpinDale/embeddings
feat: add embeddings support via Infinity-emb
2024-07-31 20:33:12 -04:00
kingbri
f111052e39 Dependencies: Use hosted pip index instead of Github
Installing directly from github causes pip's HTTP cache to not
recognize that the correct version of a package is already installed.
This causes a redownload.

When using the Start.bat script, it updates dependencies automatically
to keep users on the latest versions of a package for security reasons.

A simple pip cache website helps alleviate this problem and allows pip
to find the cached wheels when invoked with an upgrade argument.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-30 20:46:37 -04:00
kingbri
d85414738d Dependencies: Update Flash Attention 2
v2.6.3 with torch 2.3 wheels.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-28 13:50:15 -04:00
kingbri
c79e0832d5 Revert "Dependencies: Update pytorch and flash_attention"
This reverts commit f47d96790c.

See https://github.com/pytorch/pytorch/issues/131662 for more information.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-28 13:49:04 -04:00
kingbri
f47d96790c Dependencies: Update pytorch and flash_attention
v2.4.0 and v2.6.3

Also use torch 2.4 wheels.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-25 23:39:52 -04:00
AlpinDale
f20cd330ef feat: add embeddings support via sentence-transformers
2024-07-26 02:45:07 +00:00
kingbri
a1c3f6cc1c Dependencies: Update ExllamaV2
v0.1.8

Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-24 22:00:43 -04:00
kingbri
27f9559d83 Dependencies: Switch to fastapi-slim
Reduces dependency size since the full fastapi package isn't required.
Add httptools since it makes requests faster and it was installed
with fastapi previously.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-24 21:59:56 -04:00
kingbri
42bc4adcfb Config: Add option to set priority to realtime
Realtime process priority directs system resources to tabby's
processes. Running as administrator sets realtime priority,
while running as a normal user sets high priority instead.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-24 21:50:06 -04:00
kingbri
5c082b7e8c Async: Add option to use Uvloop/Winloop
These are faster event loops for asyncio which should improve overall
performance. Gate these under an experimental flag for now to stress
test these loops.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-24 18:59:20 -04:00
kingbri
5917515696 Dependencies: Update flash-attention
v2.6.1

Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-12 10:09:49 -04:00
kingbri
073e9fa6f0 Dependencies: Bump ExllamaV2
v0.1.7

Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-11 14:22:50 -04:00