Model: Remove num_experts_per_token
This shouldn't be an exposed option, since changing it always breaks inference with the model. Let the model's config.json handle it.

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
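The commit's rationale is that the expert count should come from the model's own config.json rather than a user-facing option. A minimal sketch of that fallback, assuming a Hugging Face-style config where the field is named `num_experts_per_tok` (the key name and the `resolve_num_experts` helper are illustrative, not tabbyAPI's actual code):

```python
import json
from typing import Optional


def resolve_num_experts(config_path: str) -> Optional[int]:
    """Read the per-token expert count from a model's config.json.

    Hypothetical helper: instead of exposing num_experts_per_token as a
    load option, the value is always taken from the model config itself.
    Returns None if the model is not an MoE model (key absent).
    """
    with open(config_path) as f:
        config = json.load(f)
    # Mixtral-style MoE configs typically store this as "num_experts_per_tok";
    # the exact key is an assumption here.
    return config.get("num_experts_per_tok")
```

With this approach the loader never overrides the value, which is what makes removing the option safe.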
parent 698d8339cb
commit 79f9c6e854

6 changed files with 0 additions and 30 deletions
@@ -75,7 +75,6 @@ Note: Most of the options here will only apply on initial model load/startup (ep
  | max_batch_size | Int (None) | The absolute maximum amount of prompts to process at one time. This value is automatically adjusted based on cache size. |
  | prompt_template | String (None) | Name of a jinja2 chat template to apply for this model. Must be located in the `templates` directory. |
  | vision | Bool (False) | Enable vision support for the provided model (if it exists). |
- | num_experts_per_token | Int (None) | Number of experts to use per-token for MoE models. Pulled from the config.json if not specified. |
### Draft Model Options