Model: Add tensor_parallel_backend option

This lets users choose between the nccl and native backends depending on their GPU setup. NCCL is only available in Linux-built wheels.

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-08-17 21:42:30 -04:00
commit 43f9483bc4
parent b9952f319e
4 changed files with 26 additions and 2 deletions
--- a/config_sample.yml
+++ b/config_sample.yml
@@ -87,6 +87,12 @@ model:
  # This ignores the gpu_split_auto value.
  tensor_parallel: false

+  # Sets the backend type for tensor parallelism (default: native).
+  # Options: native, nccl
+  # Native is recommended for PCIe GPUs.
+  # NCCL is recommended for NVLink.
+  tensor_parallel_backend: native
+
  # Automatically allocate resources to GPUs (default: True).
  # Not parsed for single GPU users.
  gpu_split_auto: true
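
A minimal sketch of how a loader might validate the new tensor_parallel_backend value. It is not taken from this commit: the helper name resolve_tp_backend, its platform parameter, and the choice to fall back to native when nccl is requested on a non-Linux platform are all assumptions made for illustration. The only facts used from the commit are the two option names, the native default, and that NCCL ships only in Linux-built wheels.

```python
import sys
from typing import Optional

# Option names and default taken from config_sample.yml in this commit.
VALID_BACKENDS = ("native", "nccl")
DEFAULT_BACKEND = "native"


def resolve_tp_backend(value: Optional[str], platform: str = sys.platform) -> str:
    """Hypothetical helper: normalize and validate tensor_parallel_backend.

    Assumption: requesting nccl on a non-Linux platform falls back to
    native instead of raising. The real project may behave differently.
    """
    if value is None:
        # Key missing from the config: use the documented default.
        return DEFAULT_BACKEND

    backend = value.strip().lower()
    if backend not in VALID_BACKENDS:
        raise ValueError(
            f"tensor_parallel_backend must be one of {VALID_BACKENDS}, got {value!r}"
        )

    if backend == "nccl" and not platform.startswith("linux"):
        # NCCL is only available in Linux-built wheels (per the commit message).
        return DEFAULT_BACKEND

    return backend
```

For example, resolve_tp_backend("nccl", platform="win32") returns "native" under the fallback assumption, while an unknown value such as "mpi" raises ValueError.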