tabbyAPI-ollama

kingbri 09a4c79847	2024-03-18 22:54:59 -04:00
Model: Auto-scale max_tokens by default

If max_tokens is None, it automatically scales to fill up the context. This does not mean the generation will fill up that context since EOS stops also exist.

Originally suggested by #86

Signed-off-by: kingbri <bdashore3@proton.me>

args.py	Tree: Format	2024-03-13 00:02:55 -04:00
auth.py	API: Cleanup permission endpoint	2024-03-18 15:13:26 -04:00
concurrency.py	API: Fix blocking iterator execution	2024-03-16 23:23:31 -04:00
config.py	Tree: Update to cleanup globals	2024-03-12 23:59:30 -04:00
gen_logging.py	Tree: Format	2024-03-13 23:33:18 -04:00
logger.py	Logging: Escape rich markup sequences	2024-03-11 00:28:48 -04:00
model.py	Model: Fix load if model didn't load properly	2024-03-16 23:23:31 -04:00
sampling.py	Model: Auto-scale max_tokens by default	2024-03-18 22:54:59 -04:00
signals.py	Signal: Fix signal handlers for uvicorn	2024-03-16 23:23:31 -04:00
templating.py	Tree: Format	2024-03-13 00:02:55 -04:00
utils.py	Tree: Switch to async generators	2024-03-16 23:23:31 -04:00
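
The latest commit touching sampling.py (Model: Auto-scale max_tokens by default) says that when max_tokens is None, the limit scales to fill the remaining context. A minimal sketch of that rule might look like the following. The function name resolve_max_tokens and the parameters max_seq_len and prompt_len are hypothetical and not taken from the repository; the actual implementation may differ.

```python
from typing import Optional


def resolve_max_tokens(
    max_tokens: Optional[int], max_seq_len: int, prompt_len: int
) -> int:
    """Hypothetical helper: choose the generation token limit for a request.

    Assumption: "fill up the context" means the context length minus the
    tokens already consumed by the prompt.
    """
    # Space left in the context window after the prompt (never negative).
    remaining = max(max_seq_len - prompt_len, 0)

    # No explicit limit given: allow generation up to the remaining context.
    if max_tokens is None:
        return remaining

    # An explicit limit is passed through unchanged in this sketch.
    return max_tokens
```

This value is only an upper bound. As the commit message notes, generation can still end earlier when an EOS token or another stop condition is hit.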