Model: Add tensor_parallel_backend option

This lets users choose between the NCCL and native backends for tensor
parallelism, depending on their GPU setup. NCCL is only available in
wheels built for Linux.
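
As a rough illustration of the constraint above, the sketch below
normalizes a requested backend name and falls back to "native" when
"nccl" is requested on a non-Linux platform. The helper name
resolve_tensor_parallel_backend, the VALID_BACKENDS tuple, and the
fallback behavior are all assumptions made for this example; the commit
itself only adds the option with a default of "native".

```python
import sys
from typing import Optional

# Assumed set of accepted values, taken from the commit message.
VALID_BACKENDS = ("native", "nccl")


def resolve_tensor_parallel_backend(
    requested: Optional[str] = "native",
    platform: str = sys.platform,
) -> str:
    """Hypothetical helper, not part of the commit.

    Returns the backend to use, falling back to "native" when "nccl"
    is requested on a platform other than Linux.
    """
    # Treat None or an empty string as the documented default.
    backend = (requested or "native").lower()

    if backend not in VALID_BACKENDS:
        raise ValueError(
            f"Unknown tensor_parallel_backend {requested!r}; "
            f"expected one of {VALID_BACKENDS}"
        )

    # Assumption: silently fall back instead of raising on non-Linux hosts.
    if backend == "nccl" and not platform.startswith("linux"):
        return "native"

    return backend
```

Raising an error instead of falling back would be an equally reasonable
choice; the fallback is used here only to keep the example short.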

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
kingbri 2025-08-17 21:42:30 -04:00
parent b9952f319e
commit 43f9483bc4
4 changed files with 26 additions and 2 deletions

@@ -90,6 +90,7 @@ class ModelLoadRequest(BaseModel):
         examples=[4096],
     )
     tensor_parallel: Optional[bool] = None
+    tensor_parallel_backend: Optional[str] = "native"
     gpu_split_auto: Optional[bool] = None
     autosplit_reserve: Optional[List[float]] = None
     gpu_split: Optional[List[float]] = Field(