tabbyAPI-ollama/common/concurrency.py
kingbri 7fded4f183 Tree: Switch to async generators
Async generation removes many of the roadblocks that come with managing
tasks using threads. It should allow for abortable tasks and modern async
paradigms.

NOTE: Exllamav2 itself is not an asynchronous library. It has only been
wrapped in tabby's async layer to allow for a fast and concurrent
API server. Whether to run stream_ex in a separate thread or to manage
it manually using asyncio.sleep(0) is still being debated.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-03-16 23:23:31 -04:00
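
The two options weighed in the note above can be sketched roughly as follows. This is only an illustration, not the project's implementation: `blocking_stream` is a hypothetical stand-in for a synchronous token stream such as stream_ex, and `stream_with_sleep`, `stream_in_thread` and `collect` are names invented for this example.

```python
import asyncio
import time
from typing import AsyncGenerator, Callable, Iterator, List


def blocking_stream(count: int) -> Iterator[int]:
    """Hypothetical synchronous stream; each chunk blocks briefly."""
    for index in range(count):
        time.sleep(0.01)  # simulate blocking work per chunk
        yield index


async def stream_with_sleep(
    gen_fn: Callable[[], Iterator[int]],
) -> AsyncGenerator[int, None]:
    """Option 1: iterate inline and hand control back after each chunk."""
    for item in gen_fn():
        yield item
        # The loop is still blocked while a chunk is produced; this only
        # lets other tasks run between chunks.
        await asyncio.sleep(0)


async def stream_in_thread(
    gen_fn: Callable[[], Iterator[int]],
) -> AsyncGenerator[int, None]:
    """Option 2: produce each chunk in a worker thread (Python 3.9+)."""
    iterator = gen_fn()
    sentinel = object()
    while True:
        # A sentinel marks exhaustion, so StopIteration never has to
        # cross the thread boundary.
        item = await asyncio.to_thread(next, iterator, sentinel)
        if item is sentinel:
            break
        yield item


async def collect(stream: AsyncGenerator[int, None]) -> List[int]:
    """Drain an async generator into a list."""
    return [item async for item in stream]


if __name__ == "__main__":
    print(asyncio.run(collect(stream_with_sleep(lambda: blocking_stream(3)))))
    print(asyncio.run(collect(stream_in_thread(lambda: blocking_stream(3)))))
```

With the first option the event loop cannot serve other requests while a chunk is being computed. The second keeps the loop free, at the cost of one thread hand-off per chunk.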

"""Concurrency handling"""

import asyncio
import inspect
from typing import Any, AsyncGenerator, Callable, Generator, Union

# A single shared slot: only one generation or call runs at a time.
generate_semaphore = asyncio.Semaphore(1)


def release_semaphore():
    """Release the generation semaphore."""
    generate_semaphore.release()


async def generate_with_semaphore(
    generator: Callable[[], Union[AsyncGenerator, Generator]],
):
    """Generate with a semaphore."""

    async with generate_semaphore:
        # Async generator functions (including functools.partial wrappers)
        # are iterated with `async for`; anything else is treated as a
        # synchronous generator function.
        if inspect.isasyncgenfunction(generator):
            async for result in generator():
                yield result
        else:
            for result in generator():
                yield result


async def call_with_semaphore(callback: Callable[[], Any]):
    """Call with a semaphore."""

    async with generate_semaphore:
        # Await coroutine functions; call plain functions directly.
        if inspect.iscoroutinefunction(callback):
            return await callback()
        else:
            return callback()
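
A minimal usage sketch of the semaphore-guarded generator, assuming callers wrap their generator functions with functools.partial. `fake_stream`, `demo` and `consume` are hypothetical names for this example. The helper is repeated as a local copy so that the sketch runs on its own, and the semaphore is created inside the running loop rather than at import time.

```python
import asyncio
import inspect
from functools import partial
from typing import List


async def fake_stream(tag: str, count: int):
    """Hypothetical async generator standing in for a token stream."""
    for index in range(count):
        await asyncio.sleep(0)  # give other tasks a chance to run
        yield f"{tag}{index}"


async def demo() -> List[str]:
    semaphore = asyncio.Semaphore(1)
    seen: List[str] = []

    async def generate_with_semaphore(generator):
        # Local copy of the helper, bound to the semaphore created above.
        async with semaphore:
            if inspect.isasyncgenfunction(generator):
                async for result in generator():
                    yield result
            else:
                for result in generator():
                    yield result

    async def consume(tag: str):
        stream = generate_with_semaphore(partial(fake_stream, tag, 3))
        async for chunk in stream:
            seen.append(chunk)

    # Both consumers start together, but the semaphore serializes them.
    await asyncio.gather(consume("a"), consume("b"))
    return seen


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Even though `fake_stream` yields control on every chunk, the chunks of one consumer arrive as a contiguous block, because the second consumer waits for the semaphore until the first stream is exhausted.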