diff --git a/README.md b/README.md
index c22dae7..baa64ea 100644
--- a/README.md
+++ b/README.md
@@ -34,9 +34,9 @@
 >
 > Want to run GGUF models? Take a look at [YALS](https://github.com/theroyallab/YALS), TabbyAPI's sister project.
 
-A FastAPI based application that allows for generating text using an LLM (large language model) using the [Exllamav2 backend](https://github.com/turboderp/exllamav2)
+A FastAPI-based application for generating text with an LLM (large language model) using the [Exllamav2](https://github.com/turboderp-org/exllamav2) and [Exllamav3](https://github.com/turboderp-org/exllamav3) backends.
 
-TabbyAPI is also the official API backend server for ExllamaV2.
+TabbyAPI is also the official API backend server for ExllamaV2 and ExllamaV3.
 
 ## Disclaimer
 
@@ -78,10 +78,12 @@
 And much more. If something is missing here, PR it in!
 
 ## Supported Model Types
 
-TabbyAPI uses Exllamav2 as a powerful and fast backend for model inference, loading, etc. Therefore, the following types of models are supported:
+TabbyAPI uses Exllama as a powerful and fast backend for model inference, loading, etc. Therefore, the following types of models are supported:
 
 - Exl2 (Highly recommended)
+- Exl3 (Highly recommended)
+
 - GPTQ
 
 - FP16 (using Exllamav2's loader)
@@ -106,7 +108,8 @@
 If you have a Pull Request
 TabbyAPI would not exist without the work of other contributors and FOSS projects:
 
-- [ExllamaV2](https://github.com/turboderp/exllamav2)
+- [ExllamaV2](https://github.com/turboderp-org/exllamav2)
+- [ExllamaV3](https://github.com/turboderp-org/exllamav3)
 - [Aphrodite Engine](https://github.com/PygmalionAI/Aphrodite-engine)
 - [infinity-emb](https://github.com/michaelfeil/infinity)
 - [FastAPI](https://github.com/fastapi/fastapi)