Update README

Include ExllamaV3

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
kingbri 2025-06-13 14:55:36 -04:00
parent 691a080ac7
commit f4ee56ba13


@@ -34,9 +34,9 @@
>
> Want to run GGUF models? Take a look at [YALS](https://github.com/theroyallab/YALS), TabbyAPI's sister project.
-A FastAPI based application that allows for generating text using an LLM (large language model) using the [Exllamav2 backend](https://github.com/turboderp/exllamav2)
+A FastAPI based application for generating text with an LLM (large language model) using the [Exllamav2](https://github.com/turboderp-org/exllamav2) and [Exllamav3](https://github.com/turboderp-org/exllamav3) backends.
-TabbyAPI is also the official API backend server for ExllamaV2.
+TabbyAPI is also the official API backend server for ExllamaV2 and ExllamaV3.
## Disclaimer
@@ -78,10 +78,12 @@ And much more. If something is missing here, PR it in!
## Supported Model Types
-TabbyAPI uses Exllamav2 as a powerful and fast backend for model inference, loading, etc. Therefore, the following types of models are supported:
+TabbyAPI uses Exllama as a powerful and fast backend for model inference, loading, etc. Therefore, the following types of models are supported:
- Exl2 (Highly recommended)
+- Exl3 (Highly recommended)
- GPTQ
- FP16 (using Exllamav2's loader)
@@ -106,7 +108,8 @@ If you have a Pull Request
TabbyAPI would not exist without the work of other contributors and FOSS projects:
-- [ExllamaV2](https://github.com/turboderp/exllamav2)
+- [ExllamaV2](https://github.com/turboderp-org/exllamav2)
+- [ExllamaV3](https://github.com/turboderp-org/exllamav3)
- [Aphrodite Engine](https://github.com/PygmalionAI/Aphrodite-engine)
- [infinity-emb](https://github.com/michaelfeil/infinity)
- [FastAPI](https://github.com/fastapi/fastapi)
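For context on the server the README describes, here is a minimal sketch of building a request to TabbyAPI's OpenAI-compatible completions endpoint. The host, port, route, model name, and auth header below are assumptions for illustration, not details from this commit; the request is only constructed, not sent.

```python
import json
from urllib import request

def build_completion_request(prompt: str, api_key: str = "example-key"):
    """Construct (but do not send) a POST to a hypothetical local TabbyAPI
    instance. URL, port, and Bearer-style auth header are assumptions."""
    payload = {
        "prompt": prompt,
        "max_tokens": 64,
        "temperature": 0.8,
    }
    return request.Request(
        "http://127.0.0.1:5000/v1/completions",  # assumed default host/port
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )

req = build_completion_request("Once upon a time")
print(req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` would require a running TabbyAPI instance with a matching API key.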