20 commits

Author SHA1 Message Date
DocShotgun
998abe5ad1 Config: Enable safe sampler overrides by default
* Provides safe fallback samplers, intended for better out-of-the-box support for clients that do not pass sampler params
2025-08-18 12:32:28 -07:00
kingbri
6379081dd8 Sampling: Make add_bos_token override concise
Also set the default to None so text completions follow the same
pattern.

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-05-10 19:07:35 -04:00
kingbri
42346c6b39 Sampling: Remove skip_special_tokens
This parameter is way too confusing and does not make sense in
the modern LLM space.

Change approved by all maintainers.

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-05-09 22:11:33 -04:00
kingbri
1afc9b983e Model: Remove generate_window
Not required since we raise an error when exceeding max_seq_len

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>
2025-04-16 12:59:02 -04:00
kingbri
56ce82ef77 Sampling: Add XTC support
Matches with upstream.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-24 18:10:52 -04:00
kingbri
9c4a0e650f Sampling: Fix override for DRY sequence breakers
The common type should be an array of strings.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-07 21:38:50 -04:00
kingbri
4f5ca7a4c7 Sampling: Update overrides and params
Re-order to make more sense.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-09-07 12:48:59 -04:00
kingbri
21712578cf API: Add allowed_tokens support
This is the opposite of banned tokens. Exllama specific implementation
of #181.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-08-29 21:44:42 -04:00
DocShotgun
dad34237ba Samplers: Add example override for generate_window
2024-05-12 00:39:01 -07:00
DocShotgun
9463ecfa40 Samplers: Minor fixes for sampler override
* Add missing settings to sample_preset.yml
* Fix override for skip_special_tokens
2024-05-12 00:31:31 -07:00
kingbri
c8ec742be9 Samplers: Expose skew sampling
Skew is an extra unused sampler in ExllamaV2. Add it in for coverage.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-05-12 01:41:01 -04:00
DocShotgun
c0b631ba92 API: Add banned_strings
From exllamav2: List of strings that the generator will refuse to output. As soon as a partial match happens, a checkpoint is saved that the generator can rewind to if need be. Subsequent tokens are then held until the full string is resolved (match or no match) and either emitted or discarded, accordingly.
2024-05-10 13:53:55 -07:00
DocShotgun
a1df22668b API: Add min_tokens
Bans the EOS token until the generation reaches a minimum length. This will not prevent the model from otherwise ending the generation early by outputting other stop conditions.
2024-05-10 12:30:17 -07:00
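
The EOS ban described in a1df22668b can be pictured as a logit mask. The sketch below is only an illustration of that idea, not the project's code: `apply_min_tokens`, `eos_token_id`, and the plain-list logits are hypothetical names chosen for the example.

```python
import math


def apply_min_tokens(logits, eos_token_id, generated_count, min_tokens):
    """Return a copy of logits with EOS masked until min_tokens is reached.

    Hypothetical helper: logits is a plain list of floats indexed by token id.
    """
    masked = list(logits)
    if generated_count < min_tokens:
        # -inf makes the EOS token unselectable by argmax or softmax sampling.
        masked[eos_token_id] = -math.inf
    return masked


# EOS (id 1) is masked on the first step, then left alone once the minimum is met.
early = apply_min_tokens([0.1, 2.0, 0.3], eos_token_id=1, generated_count=0, min_tokens=2)
late = apply_min_tokens([0.1, 2.0, 0.3], eos_token_id=1, generated_count=2, min_tokens=2)
```

Only the EOS logit is touched, which is consistent with the note that other stop conditions can still end the generation early.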
kingbri
d716527b92 Sampling: Add additive param to overrides
Additive is used to add collections together. Currently, it's used
for lists, but it can be used for dictionaries in the future.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-03-31 01:10:55 -04:00
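
A minimal sketch of how an additive override might combine lists, assuming each override entry carries `override`, `force`, and `additive` fields. The function name `resolve_list_param` and the precedence of `force` over `additive` are assumptions made for this example, not confirmed by the commit message.

```python
def resolve_list_param(request_value, override, force=False, additive=False):
    """Combine a client-supplied list with a server-side override list.

    Hypothetical helper. Assumed precedence: force, then additive, then fallback.
    """
    if force:
        # Forced overrides replace whatever the client sent.
        return list(override)
    if additive and isinstance(request_value, list):
        # Additive overrides are appended; duplicates are not removed.
        return request_value + list(override)
    if request_value is None:
        # Plain fallback: used only when the client sent nothing.
        return list(override)
    return request_value


merged = resolve_list_param(["</s>"], ["###"], additive=True)
forced = resolve_list_param(["</s>"], ["###"], force=True, additive=True)
fallback = resolve_list_param(None, ["###"])
```

Dictionaries, mentioned as a possible future extension, would need a merge step in place of the list concatenation.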
kingbri
efc01d947b API + Model: Add speculative ngram decoding
Speculative ngram decoding is like speculative decoding without the
draft model. It's not as useful because it only decodes on predictable
sequences, but it depends on the use case.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-03-13 23:32:11 -04:00
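
The general idea of drafting from n-grams, without a draft model, can be sketched as a lookup over the tokens already in context. This illustrates the technique in general and is not the ExLlamaV2 implementation; `propose_ngram_draft`, `ngram_size`, and `max_draft` are hypothetical names.

```python
def propose_ngram_draft(tokens, ngram_size=2, max_draft=4):
    """Propose draft tokens by matching the last n-gram against earlier context.

    Hypothetical helper: returns the tokens that followed the most recent
    earlier occurrence of the trailing n-gram, or [] when there is no match.
    """
    if len(tokens) <= ngram_size:
        return []
    tail = tokens[-ngram_size:]
    # Walk backwards so the most recent earlier occurrence wins.
    # Starting at len - ngram_size - 1 skips the trailing n-gram itself.
    for start in range(len(tokens) - ngram_size - 1, -1, -1):
        if tokens[start:start + ngram_size] == tail:
            follow = start + ngram_size
            return tokens[follow:follow + max_draft]
    return []


repetitive = propose_ngram_draft([1, 2, 3, 4, 1, 2], ngram_size=2, max_draft=2)
novel = propose_ngram_draft([1, 2, 3, 4], ngram_size=2, max_draft=2)
```

In a full decoder the main model would still verify the proposed tokens and keep only the ones it agrees with. The lookup only pays off when the text repeats itself, which matches the caveat in the commit message about predictable sequences.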
AliCat
bb48f77ca1 Neutralize samplers (#59)
* Update sample_preset.yml

Neutralized the samplers.

* Sampling: Fix dynatemp defaults

Default max temp and min temp are 1.0

* Sampling: Fix TFS defaults

Default is 1.0

---------

Co-authored-by: AliCat <86847834+alicat22@users.noreply.github.com>
Co-authored-by: kingbri <bdashore3@proton.me>
2024-02-08 00:23:09 -05:00
Alexander Abushady
d7c18855e7 added quadratic sampling (#56)
* added quadratic sampling

* Update sample_preset.yml

* oops missed a spot

* Sampling: Fix smoothing factor semantics
2024-02-02 22:12:59 -05:00
kingbri
4a7b8b1b7a Samplers: Add dynamic temperature
Does not work if max_temp is less than or equal to min_temp. Sampler
validation will have to be refactored in the future, so the dynamic
temperature check will also be changed.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-01-31 01:20:59 -05:00
kingbri
fc4570220c API + Model: Add new parameters and clean up documentation
The example JSON fields were changed because of the new sampler
default strategy. Fix these by manually changing the values.

Also add support for fasttensors and expose generate_window to
the API. It's recommended to not adjust generate_window as it's
dynamically scaled based on max_seq_len by default.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-01-25 00:15:40 -05:00
kingbri
6c30f24c83 Tree: Unify sampler parameters and add override support
Unify API sampler params into a superclass which should make them
easier to manage and inherit generic functions from.

Not all frontends expose all sampling parameters due to connections
with OAI (which handles sampling itself, with the exception of
a few sliders).

Add the ability for the user to customize fallback parameters from
server-side.

In addition, parameters can be forced to a certain value server-side
in case the repo automatically sets other sampler values in the
background that the user doesn't want.

Signed-off-by: kingbri <bdashore3@proton.me>
2024-01-25 00:15:40 -05:00
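
The two behaviours described in 6c30f24c83, server-side fallbacks and forced values, can be sketched as a single resolution step. This is a simplified illustration under assumed names: `resolve_param` and the `{"override": ..., "force": ...}` entry shape are hypothetical and may not match the project's actual structures.

```python
def resolve_param(name, request_params, overrides):
    """Pick the effective value for one sampler parameter.

    Hypothetical helper. request_params holds what the client sent, and
    overrides maps parameter names to {"override": value, "force": bool}.
    """
    entry = overrides.get(name)
    client_value = request_params.get(name)
    if entry is None:
        # No server-side entry: pass the client value through unchanged.
        return client_value
    if entry.get("force", False):
        # Forced: the server value wins even if the client sent one.
        return entry["override"]
    if client_value is None:
        # Fallback: used only when the client omitted the parameter.
        return entry["override"]
    return client_value


overrides = {
    "temperature": {"override": 0.7, "force": False},
    "top_k": {"override": 40, "force": True},
}
request = {"top_k": 100}

temperature = resolve_param("temperature", request, overrides)
top_k = resolve_param("top_k", request, overrides)
```

Here the omitted `temperature` falls back to the server value, while the client's `top_k` is replaced because the entry is forced.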