API: Fix num_experts_per_token reporting

The reported value wasn't linked to the model config. This value can
be 1 if a MoE model isn't loaded.

Signed-off-by: kingbri <bdashore3@proton.me>
kingbri 2023-12-28 00:31:14 -05:00
parent c5bbfd97b2
commit 3622710582
2 changed files with 2 additions and 1 deletion


@@ -77,7 +77,7 @@ model:
   # NOTE: Only works with chat completion message lists!
   prompt_template:
-  # Number of experts to use per token. Loads from the model's config.json if not specified (default: None)
+  # Number of experts to use PER TOKEN. Fetched from the model's config.json if not specified (default: None)
   # WARNING: Don't set this unless you know what you're doing!
   # NOTE: For MoE models (ex. Mixtral) only!
   num_experts_per_token:
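
The fallback behavior the config comment describes can be sketched as below. This is a minimal illustration, not TabbyAPI's actual implementation; the function name `resolve_num_experts_per_token` and its signature are hypothetical. It assumes the standard HF-style layout where a MoE model's `config.json` carries a `num_experts_per_token` key and dense models omit it.

```python
import json
from pathlib import Path


def resolve_num_experts_per_token(user_value, model_dir):
    """Hypothetical helper: prefer the user-configured value, otherwise
    fall back to the model's config.json (MoE models like Mixtral expose
    num_experts_per_token there; dense models do not, so report 1)."""
    if user_value is not None:
        return user_value

    config_path = Path(model_dir) / "config.json"
    if config_path.is_file():
        config = json.loads(config_path.read_text())
        return config.get("num_experts_per_token", 1)

    # No model config available: behave like a non-MoE model.
    return 1
```

With this fallback, leaving `num_experts_per_token:` unset in the YAML lets the model's own config drive the reported value, matching the commit's fix.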