tabbyAPI-ollama/common
kingbri efc01d947b API + Model: Add speculative ngram decoding
Speculative ngram decoding is like speculative decoding without the
draft model. It's not as useful because it only speeds up decoding on
predictable sequences, but how much that matters depends on the use case.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-03-13 23:32:11 -04:00
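
To illustrate the idea in the commit message above, here is a minimal sketch of the drafting step in n-gram speculation. It is not the code from this commit or from the underlying inference library; the function name `propose_ngram_draft` and its parameters `ngram_size` and `max_draft` are hypothetical. The sketch looks for the most recent earlier occurrence of the trailing n-gram in the context and proposes the tokens that followed it as the draft. A real decoder would then verify the draft with the model in one forward pass and keep only the matching prefix.

```python
from typing import List, Sequence


def propose_ngram_draft(
    tokens: Sequence[int], ngram_size: int = 2, max_draft: int = 4
) -> List[int]:
    """Propose draft tokens by matching the trailing n-gram earlier in the context.

    Hypothetical helper for illustration only. Returns an empty list when the
    trailing n-gram has not been seen before, i.e. the sequence is not
    "predictable" from its own history.
    """
    if ngram_size <= 0 or len(tokens) <= ngram_size:
        return []

    suffix = list(tokens[-ngram_size:])

    # Walk backwards so the most recent earlier occurrence wins.
    for start in range(len(tokens) - ngram_size - 1, -1, -1):
        if list(tokens[start:start + ngram_size]) == suffix:
            follow = start + ngram_size
            # Tokens that followed the earlier occurrence become the draft.
            return list(tokens[follow:follow + max_draft])

    return []


if __name__ == "__main__":
    repetitive = [1, 2, 3, 4, 1, 2]
    novel = [1, 2, 3, 4, 5, 6]
    print(propose_ngram_draft(repetitive, ngram_size=2, max_draft=2))
    print(propose_ngram_draft(novel, ngram_size=2, max_draft=2))
```

The repetitive input yields a draft because its last two tokens already appeared earlier, while the novel input yields nothing, so decoding falls back to one token per step. This is why the technique only helps on predictable sequences.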
args.py         Tree: Format                                            2024-03-13 00:02:55 -04:00
auth.py         Tree: Format                                            2024-03-13 00:02:55 -04:00
config.py       Tree: Update to cleanup globals                         2024-03-12 23:59:30 -04:00
gen_logging.py  Logging: Move metrics to gen logging                    2024-03-13 23:13:55 -04:00
generators.py   Generation: Explicitly release semaphore on disconnect  2024-03-10 17:54:48 -04:00
logger.py       Logging: Escape rich markup sequences                   2024-03-11 00:28:48 -04:00
model.py        API: Split functions into their own files               2024-03-12 23:59:30 -04:00
sampling.py     API + Model: Add speculative ngram decoding             2024-03-13 23:32:11 -04:00
templating.py   Tree: Format                                            2024-03-13 00:02:55 -04:00
utils.py        Startup: Check if the port is available and fallback    2024-03-11 21:57:28 -04:00