Migrate OpenAPI and sample config export to subcommands "export-openapi"
and "export-config".
Also add a "download" subcommand that passes args to the TabbyAPI
downloader. This allows models to be downloaded via the API and
CLI args.
Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
Since the full argparser requires pydantic, gate it until all dependencies
are installed.
Also if the venv is deleted, assume that start_options.json is invalid
as well.
Signed-off-by: kingbri <bdashore3@proton.me>
Remove access of private attributes and use safer functions. Also
move generalized functions into utils files.
Signed-off-by: kingbri <bdashore3@proton.me>
Using "auto" for rope alpha removes ambiguity on how to explicitly
enable automatic rope calculation. The same behavior of None -> auto
calculate still exists, but can be overwritten if a model's tabby_config.yml
includes `rope_alpha`.
Signed-off-by: kingbri <bdashore3@proton.me>
Use the tensor parallel loader when the flag is enabled. The new loader
has its own autosplit implementation, so gpu_split_auto isn't valid
here.
Also make it easier to determine which cache type to use rather than
multiple if/else statements.
Signed-off-by: kingbri <bdashore3@proton.me>
These are faster event loops for asyncio which should improve overall
performance. Gate these under an experimental flag for now to stress
test these loops.
Signed-off-by: kingbri <bdashore3@proton.me>
This option saves some VRAM, but does have the chance to error out.
Add this in the experimental config section.
Signed-off-by: kingbri <bdashore3@proton.me>
Many APIs automatically ask for request streaming without giving
the user the option to turn it off. Therefore, give the user more
freedom by giving a server-side kill switch.
Signed-off-by: kingbri <bdashore3@proton.me>
Add the ability to use an unsafe config flag if needed and migrate
the exl2 check to a different file within the exl2 backend code.
Signed-off-by: kingbri <bdashore3@proton.me>
Move common functions into their own folder and refactor the backends
to use their own folder as well.
Also cleanup imports and alphabetize import statments themselves.
Finally, move colab and docker into their own folders as well.
Signed-off-by: kingbri <bdashore3@proton.me>