jalr/tabbyAPI-ollama

Author	SHA1	Message	Date
DocShotgun	a1df22668b	API: Add min_tokens Bans the EOS token until the generation reaches a minimum length. This will not prevent the model from otherwise ending the generation early by outputting other stop conditions.	2024-05-10 12:30:17 -07:00
kingbri	ab526f7278	Revert "API: Remove unnecessary Optional signatures" This reverts commit `7556dcf134`. The Optionals allowed requests to send "null" in the body for optional parameters which should be allowed. Signed-off-by: kingbri <bdashore3@proton.me>	2024-05-02 21:23:48 -04:00
kingbri	7556dcf134	API: Remove unnecessary Optional signatures Optional isn't necessary if the function signature has a default value. Signed-off-by: kingbri <bdashore3@proton.me>	2024-05-01 00:04:52 -04:00
kingbri	6114bfd221	API: Fix banned_tokens string when empty The string should not be parsed and any non-string elements should be removed as well. Signed-off-by: kingbri <bdashore3@proton.me>	2024-04-28 12:46:28 -04:00
kingbri	6f9da97114	API: Add banned_tokens Appends the banned tokens to the generation. This is the equivalent of setting logit bias to -100 on a specific set of tokens. Signed-off-by: kingbri <bdashore3@proton.me>	2024-04-28 11:06:09 -04:00
kingbri	9f93505bc1	OAI: Add skip_special_tokens parameter Allows the ability to decode special tokens if the user wishes. Signed-off-by: kingbri <bdashore3@proton.me>	2024-04-21 00:37:46 -04:00
kingbri	d716527b92	Sampling: Add additive param to overrides Additive is used to add collections together. Currently, it's used for lists, but it can be used for dictionaries in the future. Signed-off-by: kingbri <bdashore3@proton.me>	2024-03-31 01:10:55 -04:00
kingbri	09a4c79847	Model: Auto-scale max_tokens by default If max_tokens is None, it automatically scales to fill up the context. This does not mean the generation will fill up that context since EOS stops also exist. Originally suggested by #86 Signed-off-by: kingbri <bdashore3@proton.me>	2024-03-18 22:54:59 -04:00
kingbri	efc01d947b	API + Model: Add speculative ngram decoding Speculative ngram decoding is like speculative decoding without the draft model. It's not as useful because it only decodes on predictable sequences, but it depends on the use case. Signed-off-by: kingbri <bdashore3@proton.me>	2024-03-13 23:32:11 -04:00
kingbri	5a2de30066	Tree: Update to cleanup globals Use the module singleton pattern to share global state. This can also be a modified version of the Global Object Pattern. The main reason this pattern is used is for ease of use when handling global state rather than adding extra dependencies for a DI parameter. Signed-off-by: kingbri <bdashore3@proton.me>	2024-03-12 23:59:30 -04:00
kingbri	228c227c1e	Logging: Switch to loguru Loguru is a flexible logger that allows for easier hooking and imports into Rich with no problems. Also makes progress bars stick to the bottom of the terminal window. Signed-off-by: kingbri <bdashore3@proton.me>	2024-03-08 01:00:48 -05:00
kingbri	f6d749c771	Model: Add EBNF grammar support Using the Outlines library, add support to supply EBNF strings and pass them to the library for parsing. From there, a wrapper is created and a filter is passed to generation. Replace with an in-house solution at some point that's more flexible. Signed-off-by: kingbri <bdashore3@proton.me>	2024-02-24 23:40:11 -05:00
kingbri	57b3d69949	API + Model: Add support for JSON schema constraints Add the ability to constrain the return value of a model to be JSON. Built using the JSON schema standard to define the properties of what the model should return. This feature should be more accurate than using GBNF/EBNF to yield the same results due to the use of lmformatenforcer. GBNF/EBNF will be added in a different commit/branch. Signed-off-by: kingbri <bdashore3@proton.me>	2024-02-24 23:40:11 -05:00
kingbri	7def32e4de	Model: Fix logit bias handling If the token doesn't exist, gracefully warn instead of erroring out. Signed-off-by: kingbri <bdashore3@proton.me>	2024-02-18 18:30:58 -05:00
kingbri	a79c42ff4c	Sampling: Make validators simpler Injecting into Pydantic fields caused issues with serialization for documentation rendering. Rather than reinvent the wheel again, switch to a chain of if statements for now. This may change in the future if subclasses from the base sampler request need to be validated as well. Signed-off-by: kingbri <bdashore3@proton.me>	2024-02-11 15:28:43 -05:00
kingbri	7e730e3507	Sampling: Add universal validation system Rather than maintaining yet another function to validate sampler ranges/values, embed them in fields which allows for less maintenance in the future. Also add validation for existing samplers that can corrupt the sampling stack if set improperly. Signed-off-by: kingbri <bdashore3@proton.me>	2024-02-10 14:59:23 -05:00
kingbri	0af6a38af3	Model: Add logprobs support Returns token offsets, selected tokens, probabilities of tokens post-sampling, and normalized probability of selecting a token pre-sampling (for efficiency purposes). Only for text completions. Chat completions in a later commit. Signed-off-by: kingbri <bdashore3@proton.me>	2024-02-08 21:26:53 -05:00
AliCat	bb48f77ca1	Neutralize samplers (#59) * Update sample_preset.yml Neutralized the samplers. * Sampling: Fix dynatemp defaults Default max temp and min temp is 1.0 * Sampling: Fix TFS defaults Default is 1.0 --------- Co-authored-by: AliCat <86847834+alicat22@users.noreply.github.com> Co-authored-by: kingbri <bdashore3@proton.me>	2024-02-08 00:23:09 -05:00
erinmaybe	fa2acb2828	Adds aliases for min_temp and max_temp (#58) * Adds aliases for min_temp and max_temp * Sampling: Add dynatemp_exponent alias	2024-02-03 21:51:29 -05:00
kingbri	b827bcbb44	Sampling: Cleanup and update Cleanup how overrides are handled, class naming, and adopt exllamav2's model class to enforce latest stable version methods rather than adding multiple backwards compatibility checks. Signed-off-by: kingbri <bdashore3@proton.me>	2024-02-02 23:36:17 -05:00
kingbri	634d299fd9	Sampling: Fix smoothing factor default fallback default_factory, not default_factor Signed-off-by: kingbri <bdashore3@proton.me>	2024-02-02 23:35:15 -05:00
Alexander Abushady	d7c18855e7	added quadratic sampling (#56) * added quadratic sampling * Update sample_preset.yml * oops missed a spot * Sampling: Fix smoothing factor semantics	2024-02-02 22:12:59 -05:00
kingbri	4a7b8b1b7a	Samplers: Add dynamic temperature Does not work if max_temp is less than or equal to min_temp. Sampler validation will have to be refactored in the future, so the dynamic temperature check will also be changed. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-31 01:20:59 -05:00
kingbri	fc4570220c	API + Model: Add new parameters and clean up documentation The example JSON fields were changed because of the new sampler default strategy. Fix these by manually changing the values. Also add support for fasttensors and expose generate_window to the API. It's recommended to not adjust generate_window as it's dynamically scaled based on max_seq_len by default. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-25 00:15:40 -05:00
kingbri	b14c5443fd	API: Add sampler override switching Allow users to switch the currently overridden samplers via the API so a restart isn't required to switch the overrides. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-25 00:15:40 -05:00
kingbri	6c30f24c83	Tree: Unify sampler parameters and add override support Unify API sampler params into a superclass which should make them easier to manage and inherit generic functions from. Not all frontends expose all sampling parameters due to connections with OAI (that handles sampling themselves with the exception of a few sliders). Add the ability for the user to customize fallback parameters from server-side. In addition, parameters can be forced to a certain value server-side in case the repo automatically sets other sampler values in the background that the user doesn't want. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-25 00:15:40 -05:00

26 commits
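
The min_tokens and banned_tokens entries above both describe suppressing specific token ids during sampling. The following is a minimal sketch of that idea in plain Python, not the repository's implementation: the helper names (apply_token_bans, greedy_pick), the list-of-floats logits, and the shared BAN_BIAS constant are all assumptions made for illustration. It only covers the EOS ban and the token ban; other stop conditions, which the log notes can still end a generation early, are out of scope.

```python
from typing import Iterable, List

# Assumption: a large negative bias stands in for a ban, mirroring the log's
# note that banned_tokens behaves like a logit bias of -100.
BAN_BIAS = -100.0


def apply_token_bans(
    logits: List[float],
    banned_tokens: Iterable[int],
    eos_token_id: int,
    generated_len: int,
    min_tokens: int,
) -> List[float]:
    """Return a biased copy of `logits` (hypothetical helper, not tabbyAPI code)."""
    blocked = set(banned_tokens)
    # min_tokens: EOS stays blocked until enough tokens have been generated.
    if generated_len < min_tokens:
        blocked.add(eos_token_id)

    biased = list(logits)  # copy so the caller's logits are left untouched
    for token_id in blocked:
        # Skip ids outside the vocabulary instead of raising.
        if 0 <= token_id < len(biased):
            biased[token_id] += BAN_BIAS
    return biased


def greedy_pick(logits: List[float]) -> int:
    """Pick the id of the highest-scoring token."""
    return max(range(len(logits)), key=logits.__getitem__)


logits = [0.1, 5.0, 2.0, 1.0]  # toy vocabulary; token 1 plays the role of EOS

# Before min_tokens is reached, EOS (1) and the banned token (2) are suppressed.
early = apply_token_bans(logits, [2], eos_token_id=1, generated_len=0, min_tokens=2)
print(greedy_pick(early))  # 3

# Once min_tokens is reached, EOS is allowed again and wins.
late = apply_token_bans(logits, [2], eos_token_id=1, generated_len=2, min_tokens=2)
print(greedy_pick(late))  # 1
```

Using an additive bias rather than a hard mask keeps the sketch aligned with the "-100 logit bias" wording in the log; a real sampler may well mask tokens differently.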