turboderp
d357f100d0
Dependencies: Bump ExllamaV3
2025-06-15 19:12:45 +02:00
turboderp
691a080ac7
Dependencies: Bump ExllamaV3 and ExllamaV2
2025-05-31 23:55:04 +02:00
kingbri
fa534fe551
Dependencies: Update Ruff
...
v0.11.10
Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-05-17 00:46:25 -04:00
kingbri
c9dc0b2aa4
Dependencies: Bump ExllamaV3 and ExllamaV2
...
v0.0.2 and v0.3.0 respectively
Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-05-12 15:29:31 -04:00
kingbri
33ac016023
Dependencies: Add ExllamaV3
...
v0.0.1
Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-05-09 21:42:07 -04:00
kingbri
2b3ed3fc79
Dependencies: Switch back to official exl2 wheels
...
These wheels are built properly and have the correct version and
filename.
Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-04-26 21:27:28 -04:00
kingbri
eb435f79e3
Dependencies (TEMP): Use my wheels for exl2
...
Use these until exl2 updates its wheels so that the version matches the
filename.
Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-04-26 02:11:33 -04:00
kingbri
136c8139f9
Dependencies: Update PyTorch, Exllamav2, and FA2
...
PyTorch: v2.7.0 on cuda 128 + ROCm 6.3
Exllamav2: v0.2.9
FA2: v2.7.4.post1 on cuda 128
Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-04-24 21:52:48 -04:00
kingbri
9834c7f99b
Dependencies: Ungate numpy
...
numpy v2 now works with Torch
Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-04-21 23:14:14 -04:00
kingbri
0dcbb7a722
Dependencies: Update torch, exllamav2, and flash-attn
...
Torch - 2.6.0
ExllamaV2 - 0.2.8
Flash-attn - 2.7.4.post1
Cuda wheels are now 12.4 instead of 12.1, feature names need to be
migrated over.
Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-02-09 01:27:48 -05:00
Jakub Filo
f8d9cfb5fd
Bump formatron to 0.4.11
2025-01-08 00:48:25 +01:00
kingbri
cfb439c0e6
Dependencies: Update exllamav2 and pytorch for ROCm
...
Exllama v0.2.7, pytorch v2.5.1 across all cards.
AMD now requires ROCm 6.2
Signed-off-by: kingbri <8082010+bdashore3@users.noreply.github.com>
2025-01-01 16:22:10 -05:00
kingbri
fa8035ef72
Dependencies: Update sse-starlette and formatron
...
Also pin newer versions of dependencies and fix an import from sse-starlette
Signed-off-by: kingbri <8082010+bdashore3@users.noreply.github.com>
2024-12-21 23:14:55 -05:00
kingbri
bc3c154c96
Dependencies: Pin tokenizers
...
Use a version greater than 0.20.0 for newer model support.
Signed-off-by: kingbri <8082010+bdashore3@users.noreply.github.com>
2024-12-13 00:58:25 -05:00
kingbri
f25ac4b833
Dependencies: Update ExllamaV2
...
v0.2.6
Signed-off-by: kingbri <8082010+bdashore3@users.noreply.github.com>
2024-12-13 00:47:29 -05:00
kingbri
8ccd7a12a2
Merge branch 'main' into formatron
2024-12-05 23:01:22 -05:00
kingbri
ac85e34356
Dependencies: Update Torch, FA2, and Exl2
...
Torch: 2.5, FA2 2.7.0.post2, Exl2 v0.2.5
Don't update torch for rocm as exl2 isn't built for rocm 6.2
Signed-off-by: kingbri <8082010+bdashore3@users.noreply.github.com>
2024-12-03 22:57:00 -05:00
kingbri
ca86ab5477
Dependencies: Remove CUDA 11.8
...
Most software has moved to CUDA 12 and cards that aren't supported by
11.8 don't use tabby anyways.
Signed-off-by: kingbri <8082010+bdashore3@users.noreply.github.com>
2024-12-03 22:37:03 -05:00
kingbri
3c4211c963
Dependencies: Ensure updated kbnf
...
Signed-off-by: kingbri <8082010+bdashore3@users.noreply.github.com>
2024-12-02 15:10:20 -05:00
DocShotgun
0836a9317f
Grammar: Initial Formatron regex and JSON schema implementation
...
* Replace LMFE's regex and JSON schema filters with Formatron's
* Remove Outlines EBNF filter in preparation for Formatron KBNF filter
* TODO: Implement Formatron KBNF filter
2024-11-23 10:27:37 -08:00
kingbri
9cd7fcaf99
Pyproject: Add pillow to deps
...
Signed-off-by: kingbri <bdashore3@proton.me>
2024-11-22 17:48:56 -05:00
kingbri
0fadb1e5e8
Merge branch 'main' into vision
2024-11-19 21:19:21 -05:00
DocShotgun
dd41eec8a4
OAI: Initial vision support in OAI chat completions
...
* Support image_url inputs containing URLs or base64 strings following OAI vision spec
* Use async lru cache for image embeddings
* Add generic wrapper class for multimodal embeddings
2024-11-17 21:23:09 -08:00
kingbri
69838e92ca
Dependencies: Update ExllamaV2
...
v0.2.4
Signed-off-by: kingbri <bdashore3@proton.me>
2024-11-13 22:16:11 -05:00
kingbri
6726014d35
Dependencies: Update ExllamaV2
...
v0.2.3
Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-30 00:17:12 -04:00
kingbri
b4cda78bcc
Dependencies: Update Ruff
...
v0.6.5
Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-19 22:39:08 -04:00
kingbri
c616b3b1ee
Dependencies: Update PyTorch
...
v2.4.1 and update all associated wheels to use their 2.4 versions.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-19 22:32:23 -04:00
TerminalMan
948fcb7f5b
migrate to ruamel.yaml
2024-09-18 01:06:34 +01:00
turboderp
318c425d84
Bump exllamav2 to 0.2.2
2024-09-14 21:43:26 +02:00
kingbri
cf97113868
Dependencies: Update Exllamav2
...
v0.2.1
Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-08 21:12:31 -04:00
kingbri
96fce34253
Dependencies: Update ExllamaV2
...
v0.2.0
Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-28 18:34:00 -04:00
kingbri
565b0300d6
Dependencies: Update Exllamav2
...
v0.1.9
Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-22 14:15:19 -04:00
kingbri
5fb9cdc2b1
Dependencies: Add Python 3.12 specific dependencies
...
Install a prebuilt fastparquet wheel for Windows and add setuptools
since torch may require it for some reason.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-03 17:43:14 -04:00
kingbri
e66d213aef
Revert "Dependencies: Use hosted pip index instead of Github"
...
This reverts commit f111052e39.
This was a bad idea since the netlify server has limited bandwidth.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-03 11:35:26 -04:00
kingbri
b124797949
Dependencies: Re-add sentence-transformers
...
This is actually required for infinity to load a model.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-02 14:35:58 -04:00
kingbri
56619810bf
Dependencies: Switch sentence-transformers to infinity-emb
...
Leftover from before the transition.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-02 13:34:47 -04:00
Brian Dashore
1bf062559d
Merge pull request #158 from AlpinDale/embeddings
...
feat: add embeddings support via Infinity-emb
2024-07-31 20:33:12 -04:00
kingbri
f111052e39
Dependencies: Use hosted pip index instead of Github
...
Installing directly from github causes pip's HTTP cache to not
recognize that the correct version of a package is already installed.
This causes a redownload.
When using the Start.bat script, it updates dependencies automatically
to keep users on the latest versions of a package for security reasons.
A simple pip cache website helps alleviate this problem and allows pip
to find the cached wheels when invoked with an upgrade argument.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-30 20:46:37 -04:00
kingbri
d85414738d
Dependencies: Update Flash Attention 2
...
v2.6.3 with torch 2.3 wheels.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-28 13:50:15 -04:00
kingbri
c79e0832d5
Revert "Dependencies: Update pytorch and flash_attention"
...
This reverts commit f47d96790c.
See https://github.com/pytorch/pytorch/issues/131662 for more information.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-28 13:49:04 -04:00
kingbri
f47d96790c
Dependencies: Update pytorch and flash_attention
...
v2.4.0 and v2.6.3
Also use torch 2.4 wheels.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-25 23:39:52 -04:00
AlpinDale
f20cd330ef
feat: add embeddings support via sentence-transformers
2024-07-26 02:45:07 +00:00
kingbri
a1c3f6cc1c
Dependencies: Update ExllamaV2
...
v0.1.8
Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-24 22:00:43 -04:00
kingbri
27f9559d83
Dependencies: Switch to fastapi-slim
...
Reduces dependency size since the full fastapi package isn't required.
Add httptools since it makes requests faster and it was installed
with fastapi previously.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-24 21:59:56 -04:00
kingbri
42bc4adcfb
Config: Add option to set priority to realtime
...
Realtime process priority directs system resources toward tabby's
processes. Running as administrator grants realtime priority, while
running as a normal user sets high priority instead.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-24 21:50:06 -04:00
kingbri
5c082b7e8c
Async: Add option to use Uvloop/Winloop
...
These are faster event loops for asyncio which should improve overall
performance. Gate these under an experimental flag for now to stress
test these loops.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-24 18:59:20 -04:00
kingbri
5917515696
Dependencies: Update flash-attention
...
v2.6.1
Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-12 10:09:49 -04:00
kingbri
073e9fa6f0
Dependencies: Bump ExllamaV2
...
v0.1.7
Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-11 14:22:50 -04:00
kingbri
e58e197f0b
Ruff: Remove deprecated rule E999
...
The syntax error rule is removed since syntax errors will always be
shown when linting anyway.
Signed-off-by: kingbri <bdashore3@proton.me>
2024-07-08 12:36:15 -04:00
kingbri
c5ea2abe24
Dependencies: Update ExllamaV2
...
v0.1.6
Signed-off-by: kingbri <bdashore3@proton.me>
2024-06-23 21:45:04 -04:00