Models can be loaded and unloaded via the API. Also add authentication for API use and for administrator tasks; the two types of authorization use different keys. Also fix the unload function so that it properly frees all used VRAM.

Signed-off-by: kingbri <bdashore3@proton.me>
14 lines
306 B
YAML
# Network options
network:
  host: "0.0.0.0"
  port: 8012
# Only used if you want to initially load a model
model:
  model_dir: "D:/models"
  model_name: "airoboros-mistral2.2-7b-exl2"
  max_seq_len: 4096
  gpu_split: "auto"
  rope_scale: 1.0
  rope_alpha: 1.0
  no_flash_attention: False
  low_mem: False
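The sample config above has two sections, and the `model` section is optional because it is only used to load a model at startup. As a rough illustration of how a consumer might validate such a file once it has been parsed into a dictionary, here is a minimal sketch. The names `NetworkConfig`, `ModelConfig`, `build_section`, and `parse_config` are hypothetical and do not come from the project. The default values simply mirror the sample and are assumptions. YAML parsing itself is left out so that the sketch needs only the standard library.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Optional, Tuple


@dataclass
class NetworkConfig:
    # Defaults mirror the sample config; they are assumptions, not project defaults.
    host: str = "0.0.0.0"
    port: int = 8012


@dataclass
class ModelConfig:
    # Field names follow the keys in the sample config's `model` section.
    model_dir: str = "models"  # assumed placeholder default
    model_name: Optional[str] = None
    max_seq_len: int = 4096
    gpu_split: str = "auto"
    rope_scale: float = 1.0
    rope_alpha: float = 1.0
    no_flash_attention: bool = False
    low_mem: bool = False


def build_section(cls, raw: Optional[Dict[str, Any]]):
    """Build one config section, rejecting keys the dataclass does not define."""
    raw = raw or {}
    known = {f.name for f in fields(cls)}
    unknown = set(raw) - known
    if unknown:
        raise ValueError(f"unknown keys in {cls.__name__}: {sorted(unknown)}")
    return cls(**raw)


def parse_config(raw: Dict[str, Any]) -> Tuple[NetworkConfig, Optional[ModelConfig]]:
    """Return (network, model); model is None when no `model` section is given."""
    network = build_section(NetworkConfig, raw.get("network"))
    model_raw = raw.get("model")
    model = build_section(ModelConfig, model_raw) if model_raw else None
    return network, model


# The sample config, written as the dictionary a YAML parser would produce.
example = {
    "network": {"host": "0.0.0.0", "port": 8012},
    "model": {
        "model_dir": "D:/models",
        "model_name": "airoboros-mistral2.2-7b-exl2",
        "max_seq_len": 4096,
        "gpu_split": "auto",
        "rope_scale": 1.0,
        "rope_alpha": 1.0,
        "no_flash_attention": False,
        "low_mem": False,
    },
}

network, model = parse_config(example)
```

Rejecting unknown keys is a design choice in this sketch: a misspelled option such as `max_seqlen` fails loudly instead of silently falling back to a default.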