tabbyAPI-ollama/README.md
kingbri 03f45cb0a3 Tree: Update documentation and configs
Signed-off-by: kingbri <bdashore3@proton.me>
2023-11-16 02:30:33 -05:00


# TabbyAPI
A FastAPI-based application for generating text with a large language model (LLM) using the [exllamav2 backend](https://github.com/turboderp/exllamav2).
## Disclaimer
This API is still in the alpha phase. There may be bugs and breaking changes down the line, and you might need to reinstall dependencies from time to time.
## Prerequisites
To get started, make sure you have the following installed on your system:
- Python 3.x (preferably 3.11) with pip
- CUDA 12.1 or 11.8
NOTE: For Flash Attention 2 to work on Windows, CUDA 12.1 **must** be installed!
## Installing
1. Clone this repository to your machine: `git clone https://github.com/theroyallab/tabbyAPI`
2. Navigate to the project directory: `cd tabbyAPI`
3. Create a virtual environment:
   1. `python -m venv venv`
   2. On Windows: `.\venv\Scripts\activate`. On Linux: `source venv/bin/activate`
4. Install torch using the instructions found [here](https://pytorch.org/get-started/locally/)
5. Install an exllamav2 wheel from [here](https://github.com/turboderp/exllamav2/releases):
   1. Find the wheel that matches your CUDA and Python versions. For example, a wheel with `cu121` and `cp311` corresponds to CUDA 12.1 and Python 3.11
6. Install the other requirements via: `pip install -r requirements.txt`
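If you are unsure which wheel to pick in step 5, the tag convention described there can be sketched in a few lines of Python. This is only an illustration: `wheel_tags` is a hypothetical helper, not part of TabbyAPI or exllamav2, and it assumes you already know your installed CUDA version.

```python
import sys
from typing import Optional, Tuple


def wheel_tags(cuda_version: str,
               python_version: Optional[Tuple[int, int]] = None) -> Tuple[str, str]:
    """Return the (CUDA tag, Python tag) pair to look for in a wheel filename.

    Hypothetical helper for illustration only. `cuda_version` is a string
    such as "12.1"; `python_version` defaults to the running interpreter.
    """
    if python_version is None:
        python_version = (sys.version_info[0], sys.version_info[1])
    py_major, py_minor = python_version
    cuda_major, cuda_minor = cuda_version.split(".")[:2]
    return f"cu{cuda_major}{cuda_minor}", f"cp{py_major}{py_minor}"


# The example from step 5: CUDA 12.1 with Python 3.11
print(wheel_tags("12.1", (3, 11)))
```

Calling `wheel_tags("12.1")` without a Python version uses the interpreter that runs the script, so run it inside the activated venv.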
## Configuration
Copy `config_sample.yml` to `config.yml`. All fields are commented, so read the descriptions and comment out or remove any fields that you don't need.
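The copy step can also be scripted. The sketch below is a minimal example, not part of TabbyAPI: `create_config` is a hypothetical helper, and skipping the copy when `config.yml` already exists is an assumption made here to avoid overwriting your edits.

```python
import shutil
from pathlib import Path


def create_config(project_dir: str = ".") -> Path:
    """Copy config_sample.yml to config.yml inside project_dir.

    Hypothetical helper: an existing config.yml is left untouched.
    """
    root = Path(project_dir)
    target = root / "config.yml"
    if not target.exists():
        shutil.copyfile(root / "config_sample.yml", target)
    return target
```

Call `create_config()` from the project directory, then edit the resulting `config.yml` by hand.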
## Launching the Application
1. Make sure you are in the project directory and have activated the venv
2. Run the tabbyAPI application: `python main.py`
## API Documentation
Once the API is running, the docs can be accessed at `http://<your-IP>:<your-port>/docs`
If you use the default YAML config, it's accessible at `http://localhost:5000/docs`
## Authentication
TabbyAPI uses an API key and admin key to authenticate a user's request. On first launch of the API, a file called `api_tokens.yml` will be generated with fields for the admin and API keys.
If you feel that the keys have been compromised, delete `api_tokens.yml` and the API will generate new keys for you.
API keys and admin keys can be provided via either of the following request headers:
- `x-api-key` and `x-admin-key`, respectively
- `Authorization` with the `Bearer ` prefix
DO NOT share your admin key unless you want someone else to load/unload a model from your system!
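As a rough client-side illustration of the two header styles, the sketch below builds the header dictionary for a request. `auth_headers` is a hypothetical helper and the key values are placeholders; use the keys from your own `api_tokens.yml`.

```python
from typing import Dict


def auth_headers(key: str, admin: bool = False, bearer: bool = False) -> Dict[str, str]:
    """Build request headers for an API key or admin key.

    Hypothetical helper for illustration. With bearer=True the key goes in
    the Authorization header; otherwise it goes in x-api-key or x-admin-key.
    """
    if bearer:
        return {"Authorization": f"Bearer {key}"}
    header_name = "x-admin-key" if admin else "x-api-key"
    return {header_name: key}


# Placeholder keys, shown in each supported style
print(auth_headers("your-api-key"))
print(auth_headers("your-admin-key", admin=True))
print(auth_headers("your-api-key", bearer=True))
```

The returned dictionary can be passed as the headers argument of whichever HTTP client you use.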
#### Authentication Requirements
All routes require an API key, except for the following, which require an **admin** key:
- `/v1/model/load`
- `/v1/model/unload`
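A client can mirror this rule to decide which key to send. The following is a minimal sketch of that lookup, not the server's own implementation; `ADMIN_ROUTES` and `required_key` are names invented for this example.

```python
# Routes listed above as requiring an admin key
ADMIN_ROUTES = {"/v1/model/load", "/v1/model/unload"}


def required_key(path: str) -> str:
    """Return "admin" for admin-only routes and "api" for everything else.

    Hypothetical client-side helper; a trailing slash is ignored.
    """
    return "admin" if path.rstrip("/") in ADMIN_ROUTES else "api"


print(required_key("/v1/model/load"))
```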
## Contributing
If you have issues with the project:
- Describe the issues in detail
- If you have a feature request, please indicate it as such.
If you have a pull request:
- Describe the pull request in detail: what you are changing and why
## Developers and Permissions
Creators/Developers:
- kingbri
- Splice86
- Turboderp