diff --git a/README.md b/README.md
index 63caef6..2d8c386 100644
--- a/README.md
+++ b/README.md
@@ -1,127 +1,5 @@
-# TabbyAPI
+# Tabby API Fork
-
-
-
-
-
-
-
-
-
+This fork adds extra endpoints to enable compatibility with Open WebUI and LibreChat.
-
-
-
-
-
-
-
-
-
-
-
-
-> [!IMPORTANT]
->
-> In addition to the README, please read the [Wiki](https://github.com/theroyallab/tabbyAPI/wiki/1.-Getting-Started) page for information about getting started!
-
-> [!NOTE]
->
-> Need help? Join the [Discord Server](https://discord.gg/sYQxnuD7Fj) and get the `Tabby` role. Please be nice when asking questions.
-
-> [!NOTE]
->
-> Want to run GGUF models? Take a look at [YALS](https://github.com/theroyallab/YALS), TabbyAPI's sister project.
-
-A FastAPI based application that allows for generating text using an LLM (large language model) using the [Exllamav2](https://github.com/turboderp-org/exllamav2) and [Exllamav3](https://github.com/turboderp-org/exllamav3) backends.
-
-TabbyAPI is also the official API backend server for ExllamaV2 and V3.
-
-## Disclaimer
-
-This project is marked as rolling release. There may be bugs and changes down the line. Please be aware that you might need to reinstall dependencies if needed.
-
-TabbyAPI is a hobby project made for a small amount of users. It is not meant to run on production servers. For that, please look at other solutions that support those workloads.
-
-## Getting Started
-
-> [!IMPORTANT]
->
-> Looking for more information? Check out the Wiki.
-
-For a step-by-step guide, choose the format that works best for you:
-
-📖 Read the [Wiki](https://github.com/theroyallab/tabbyAPI/wiki/01.-Getting-Started) – Covers installation, configuration, API usage, and more.
-
-🎥 Watch the [Video Guide](https://www.youtube.com/watch?v=03jYz0ijbUU) – A hands-on walkthrough to get you up and running quickly.
-
-## Features
-
-- OpenAI compatible API
-- Loading/unloading models
-- HuggingFace model downloading
-- Embedding model support
-- JSON schema + Regex + EBNF support
-- AI Horde support
-- Speculative decoding via draft models
-- Multi-lora with independent scaling (ex. a weight of 0.9)
-- Inbuilt proxy to override client request parameters/samplers
-- Flexible Jinja2 template engine for chat completions that conforms to HuggingFace
-- Concurrent inference with asyncio
-- Utilizes modern python paradigms
-- Continuous batching engine using paged attention
-- Fast classifier-free guidance
-- OAI style tool/function calling
-
-And much more. If something is missing here, PR it in!
-
-## Supported Model Types
-
-TabbyAPI uses Exllama as a powerful and fast backend for model inference, loading, etc. Therefore, the following types of models are supported:
-
-- Exl2 (Highly recommended)
-
-- Exl3 (Highly recommended)
-
-- GPTQ
-
-- FP16 (using Exllamav2's loader)
-
-In addition, TabbyAPI supports parallel batching using paged attention for Nvidia Ampere GPUs and higher.
-
-## Contributing
-
-Use the template when creating issues or pull requests, otherwise the developers may not look at your post.
-
-If you have issues with the project:
-
-- Describe the issue in detail
-
-- If you have a feature request, please indicate it as such.
-
-If you have a Pull Request
-
-- Describe the pull request in detail, what, and why you are changing something
-
-## Acknowldgements
-
-TabbyAPI would not exist without the work of other contributors and FOSS projects:
-
-- [ExllamaV2](https://github.com/turboderp-org/exllamav2)
-- [ExllamaV3](https://github.com/turboderp-org/exllamav3)
-- [Aphrodite Engine](https://github.com/PygmalionAI/Aphrodite-engine)
-- [infinity-emb](https://github.com/michaelfeil/infinity)
-- [FastAPI](https://github.com/fastapi/fastapi)
-- [Text Generation WebUI](https://github.com/oobabooga/text-generation-webui)
-- [SillyTavern](https://github.com/SillyTavern/SillyTavern)
-
-## Developers and Permissions
-
-Creators/Developers:
-
-- [kingbri](https://github.com/bdashore3)
-
-- [Splice86](https://github.com/Splice86)
-
-- [Turboderp](https://github.com/turboderp)
+You can select the initial model, but due to a Tabby/Exllama limitation, switching models will not free the memory used by the previous model.
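
To illustrate how a client might connect to this fork, here is a minimal configuration sketch for Open WebUI. It assumes TabbyAPI's default port of 5000 and an OpenAI-compatible `/v1` route; the API key value is a placeholder, and your host, port, and key may differ depending on your `config.yml`.

```shell
# Hypothetical setup: point Open WebUI at this fork's OpenAI-compatible API.
# TabbyAPI listens on port 5000 by default; adjust to match your config.
export OPENAI_API_BASE_URL="http://127.0.0.1:5000/v1"
# Placeholder key -- substitute the api_key from your TabbyAPI api_tokens.yml.
export OPENAI_API_KEY="your-tabby-api-key"
```

LibreChat can be pointed at the same base URL via a custom endpoint entry in its `librechat.yaml`; consult each client's documentation for the exact field names.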