Model: Add tensor_parallel_backend option
This lets users choose between the NCCL and native backends depending on their GPU setup. NCCL is only available in Linux-built wheels.

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
parent b9952f319e
commit 43f9483bc4
4 changed files with 26 additions and 2 deletions
@@ -90,6 +90,7 @@ class ModelLoadRequest(BaseModel):
         examples=[4096],
     )
     tensor_parallel: Optional[bool] = None
+    tensor_parallel_backend: Optional[str] = "native"
     gpu_split_auto: Optional[bool] = None
     autosplit_reserve: Optional[List[float]] = None
     gpu_split: Optional[List[float]] = Field(
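The commit message notes that NCCL is only available in Linux-built wheels, while the new field defaults to "native". As a rough illustration of how a caller might reconcile the two, here is a minimal sketch. The helper `resolve_tp_backend`, the `SUPPORTED_BACKENDS` tuple, and the silent fallback policy are all hypothetical and are not taken from this commit; the actual project may validate or reject the value differently.

```python
import sys
from typing import Optional

# Hypothetical constant: the two backend names mentioned in the commit message.
SUPPORTED_BACKENDS = ("native", "nccl")


def resolve_tp_backend(requested: Optional[str], platform: str = sys.platform) -> str:
    """Pick a tensor-parallel backend name (illustrative helper, not project code)."""
    # Treat a missing value like the field default shown in the diff.
    backend = (requested or "native").lower()

    if backend not in SUPPORTED_BACKENDS:
        raise ValueError(f"Unknown tensor_parallel_backend: {requested!r}")

    # The commit message says NCCL ships only in Linux-built wheels, so this
    # sketch falls back to "native" elsewhere. The fallback is an assumed policy.
    if backend == "nccl" and not platform.startswith("linux"):
        return "native"

    return backend
```

Passing the platform as a parameter keeps the check easy to exercise on any machine, for example `resolve_tp_backend("nccl", platform="win32")`.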