Docs: Sampler overrides part 2
Actually commit the edits. Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
parent
86f27c9c93
commit
1344726936
1 changed file with 36 additions and 137 deletions
|
|
@@ -1,150 +1,49 @@
|
|||
# Supported Samplers
|
||||
# Overview
|
||||
|
||||
Samplers alter the raw token probabilities during response generation. Users can tune them to adjust the outputs they get.
|
||||
Clients can send sampler parameters that users are unaware of, or even not send any parameters at all!
|
||||
|
||||
Sampler overrides is tabbyAPI's flexible proxy for adding or forcing sampler values before generation.
|
||||
# Setting up
|
||||
|
||||
All supported samplers are located in `sampler_overrides/sample_preset.yml`.
|
||||
|
||||
> [!NOTE]
|
||||
>
|
||||
> Sampling is not a catch-all solution if your generations are behaving the wrong way! The cause can also lie in the prompt, frontend, model, etc. Please do not set arbitrary sampler values without first understanding what they do!
|
||||
> Sampler overrides can also be switched at runtime via the `/v1/sampling/override` endpoint set. Please read the [API docs](https://theroyallab.github.io/tabbyAPI/#operation/list_sampler_overrides_v1_sampling_override_list_get) for more information.
|
||||
|
||||
## Penalties
|
||||
Steps:
|
||||
1. Create a new YAML file in the `sampler_overrides` folder, or duplicate `sample_preset.yml`
|
||||
2. Adjust the samplers you want to override using the examples below
|
||||
3. Save and rename the file (if you haven't already)
|
||||
4. Open `config.yml` and inside the `sampling` block, set the `sampler_override` key to the preset name from step 3
|
||||
# Examples
|
||||
|
||||
Repetition Penalty -
|
||||
Let's say a client doesn't send a value for `top_p`:
|
||||
|
||||
- API request field: `repetition_penalty`
|
||||
|
||||
- Default: `1.0` - Off
|
||||
|
||||
- Description: Multiplicative method of preventing repetition of previous tokens in the context.
|
||||
|
||||
```yml
|
||||
top_p:
|
||||
override: 0.7
|
||||
force: False
|
||||
```
|
||||
|
||||
Frequency Penalty -
|
||||
This override changes the default fallback of `top_p` from a neutral value of 1.0 to 0.7.
|
||||
|
||||
- API request field: `frequency_penalty`
|
||||
|
||||
- Default: `0.0` - Off
|
||||
|
||||
- Description: A constant value added each time a token is sampled, reducing the probability for that specific token.
|
||||
|
||||
Now, let's say a client forces a value that you don't like (e.g. `top_k: 30`):
|
||||
|
||||
Presence Penalty -
|
||||
```yml
|
||||
top_k:
|
||||
override: 0
|
||||
force: True
|
||||
```
|
||||
|
||||
- API request field: `presence_penalty`
|
||||
|
||||
- Default: `0.0` - Off
|
||||
|
||||
- Description: Additive method of preventing repetition of previous tokens in the context. Encourages new ideas to be generated. Unlike frequency penalty, this is a one-off application. TL;DR: repetition penalty, but additive.
|
||||
|
||||
This override forces `top_k` to be 0 no matter what the client sends.
|
||||
|
||||
Penalty Range -
|
||||
Finally, let's say a client sends a list, but you want to add something else:
|
||||
|
||||
> [!NOTE]
|
||||
>
|
||||
> Unlike other backends, `0` disables penalties entirely!
|
||||
```yml
|
||||
stop:
|
||||
override: ["a"]
|
||||
force: False
|
||||
additive: True
|
||||
```
|
||||
|
||||
- API request field: `penalty_range` or `repetition_range` or `repetition_penalty_range`
|
||||
|
||||
- Default: `-1`
|
||||
|
||||
- When frequency OR presence penalty is enabled, a `penalty_range` value of `-1` applies the penalty only to the output tokens. A lower range is advised.
|
||||
|
||||
- Otherwise a penalty range value of `-1` = max sequence length
|
||||
|
||||
- Description: Number of tokens to look back over when applying penalties.
|
||||
|
||||
- For frequency and presence penalty, this should be a low value to avoid "backing the model into a corner" when selecting similar tokens, resulting in large amounts of synonym repeats (aka "thesaurus mode").
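To make the difference between frequency penalty, presence penalty, and penalty range concrete, here is a minimal Python sketch. It is not tabbyAPI's or ExLlamaV2's implementation: `apply_penalties` is a hypothetical helper, and treating a negative range as "every token the caller passes in" is an assumption made for illustration.

```python
from collections import Counter


def apply_penalties(logits, context_tokens, frequency_penalty=0.0,
                    presence_penalty=0.0, penalty_range=-1):
    """Return a copy of `logits` with additive penalties applied.

    Hypothetical helper for illustration only.
    """
    if penalty_range == 0:
        # Per the note above, 0 disables penalties entirely.
        return list(logits)
    # Assumption: a negative range means "use all tokens the caller supplied".
    if penalty_range < 0:
        window = context_tokens
    else:
        window = context_tokens[-penalty_range:]
    counts = Counter(window)
    penalized = list(logits)
    for token_id, count in counts.items():
        # Frequency penalty scales with the number of occurrences,
        # presence penalty is applied once per seen token.
        penalized[token_id] -= frequency_penalty * count + presence_penalty
    return penalized
```

With a small `penalty_range`, only the most recent tokens are penalized, which is why a low value is suggested for the additive penalties.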
|
||||
|
||||
## Alphabet Soup
|
||||
|
||||
Top-P -
|
||||
|
||||
- API request field: `top_p`
|
||||
|
||||
- Default: `1.0` - Off
|
||||
|
||||
|
||||
Min-P -
|
||||
|
||||
- API request field: `min_p`
|
||||
|
||||
- Default: `0.0` - Off
|
||||
|
||||
|
||||
Top-K -
|
||||
|
||||
- API request field: `top_k`
|
||||
|
||||
- Default: `0.0` - Off
|
||||
|
||||
|
||||
Top-A -
|
||||
|
||||
- API request field: `top_a`
|
||||
|
||||
- Default: `0.0` - Off
|
||||
|
||||
|
||||
## Miscellaneous
|
||||
|
||||
Temperature -
|
||||
|
||||
- API request field: `temperature`
|
||||
|
||||
- Default: `1.0` - Off
|
||||
|
||||
- Description: A constant value applied to softmax calculation. A higher temperature = more randomness when choosing the next token.
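As a rough illustration of what temperature does, the sketch below divides the logits by the temperature before a softmax. This is the textbook formulation, not necessarily how the backend implements it, and `softmax_with_temperature` is a hypothetical name.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities after scaling by `temperature`."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

A temperature above 1.0 flattens the distribution (more randomness), while a value below 1.0 sharpens it; 1.0 leaves the probabilities unchanged.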
|
||||
|
||||
|
||||
Temp last -
|
||||
|
||||
- API request field: `temp_last`
|
||||
|
||||
- Default: `false` - Off
|
||||
|
||||
- Description: Places temperature application last in the sampling stack. Necessary for min-P sampling.
|
||||
|
||||
|
||||
Typical -
|
||||
|
||||
- API request field: `typical`
|
||||
|
||||
- Default: `1.0` - Off
|
||||
|
||||
|
||||
Tail-free Sampling -
|
||||
|
||||
- API request field: `tfs`
|
||||
|
||||
- Default: `1.0` - Off
|
||||
|
||||
|
||||
Logit bias -
|
||||
|
||||
- API request field: `logit_bias`
|
||||
|
||||
- Default: `None` - Off
|
||||
|
||||
- Example: `[{"1": 50}, {"2": 75}]` - An array of bias objects
|
||||
|
||||
- Description: Adds a positive or negative value to change the occurrence of a specific token. Format: `{"token": bias}` where bias is from `-100` to `100`.
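The following sketch shows one way the array-of-objects format from the example could be applied to a list of logits. `apply_logit_bias` is a hypothetical helper, and the assumption that each key is a token ID indexing directly into the logits is made for illustration only.

```python
def apply_logit_bias(logits, logit_bias):
    """Return a copy of `logits` with each bias added to its token's logit.

    `logit_bias` is a list of {"token": bias} objects, or None for "off".
    Hypothetical helper for illustration only.
    """
    biased = list(logits)
    for entry in logit_bias or []:
        for token_id, bias in entry.items():
            if not -100 <= bias <= 100:
                raise ValueError(f"bias for token {token_id} must be within [-100, 100]")
            # Assumption: the key is a token ID that indexes into the logits.
            biased[int(token_id)] += bias
    return biased
```

For example, `apply_logit_bias(logits, [{"1": 50}, {"2": 75}])` raises the logits of tokens 1 and 2 and leaves every other token untouched.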
|
||||
|
||||
|
||||
Mirostat mode -
|
||||
|
||||
- API request field: `mirostat_mode`
|
||||
|
||||
- Default: `0` - Off
|
||||
|
||||
- ExLlamaV2 only applies mirostat when `mirostat_mode` is `2`
|
||||
|
||||
Mirostat tau -
|
||||
|
||||
- API request field: `mirostat_tau`
|
||||
|
||||
- Default: `1.5` - Off unless mirostat_mode = 2
|
||||
|
||||
|
||||
Mirostat eta -
|
||||
|
||||
- API request field: `mirostat_eta`
|
||||
|
||||
- Default: `0.1` - Off unless mirostat_mode = 2
|
||||
Now this override appends the given stop string to whatever the client sends.
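The three override behaviours described in the examples (fallback, forced, additive) can be summarized in a short Python sketch. This is not tabbyAPI's actual code: `resolve_sampler` is a hypothetical function, and letting `force` take precedence over `additive` is an assumption.

```python
def resolve_sampler(client_value, override, force=False, additive=False):
    """Pick the value used for generation from a client value and an override.

    Hypothetical helper for illustration only.
    """
    if force:
        # Forced overrides win no matter what the client sent.
        return override
    if client_value is None:
        # The client sent nothing, so the override acts as the fallback.
        return override
    if additive and isinstance(client_value, list) and isinstance(override, list):
        # Additive list overrides are appended to the client's list.
        return client_value + override
    return client_value
```

Under these assumptions, `resolve_sampler(None, 0.7)` gives the `top_p` fallback, `resolve_sampler(30, 0, force=True)` gives the forced `top_k`, and `resolve_sampler(["b"], ["a"], additive=True)` gives the combined stop list.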
|
||||
|
|
|
|||