Model: Add tensor_parallel_backend option

This lets users choose the nccl or native backend depending on their GPU setup.
NCCL is only available with Linux-built wheels.

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
kingbri 2025-08-17 21:42:30 -04:00
parent b9952f319e
commit 43f9483bc4
4 changed files with 26 additions and 2 deletions

@@ -87,6 +87,12 @@ model:
# This ignores the gpu_split_auto value.
tensor_parallel: false
# Sets a backend type for tensor parallelism. (default: native).
# Options: native, nccl
# Native is recommended for PCIe GPUs
# NCCL is recommended for NVLink.
tensor_parallel_backend: native
# Automatically allocate resources to GPUs (default: True).
# Not parsed for single GPU users.
gpu_split_auto: true
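
As an illustration of how a loader might interpret the new option, here is a minimal sketch in Python. It is not the code from this commit: the function name `resolve_tp_backend`, the `VALID_BACKENDS` tuple, and the fallback to `native` on non-Linux platforms are all assumptions. They are based only on the documented options (`native`, `nccl`), the documented default (`native`), and the note that NCCL is only available with Linux-built wheels.

```python
import sys

# The two options documented in the config comments above.
VALID_BACKENDS = ("native", "nccl")


def resolve_tp_backend(requested="native", platform=sys.platform):
    """Return the tensor parallel backend to use (hypothetical helper).

    Assumption: a missing value means the documented default, "native".
    Assumption: "nccl" quietly falls back to "native" outside Linux, since
    NCCL is only available with Linux-built wheels. The real project may
    warn or raise an error instead.
    """
    backend = (requested or "native").strip().lower()

    if backend not in VALID_BACKENDS:
        raise ValueError(
            f"Unknown tensor_parallel_backend {requested!r}; "
            f"expected one of {VALID_BACKENDS}"
        )

    if backend == "nccl" and not platform.startswith("linux"):
        return "native"

    return backend
```

With this sketch, `resolve_tp_backend("nccl", platform="win32")` returns `"native"`, while the same request on `"linux"` keeps `"nccl"`. A silent fallback keeps a shared config file usable across machines, though a warning or a hard error would be an equally reasonable design choice.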