jalr/tabbyAPI-ollama

Author	SHA1	Message	Date
Brian Dashore	1dbebd48eb	Merge pull request #50 from djmaze/patch-1 Remove fschat from compose yaml	2024-01-06 00:10:20 -05:00
Martin Honermeyer	6ab02e1eeb	Remove fschat from compose yaml fschat has been removed from the Dockerfile a while ago.	2024-01-06 02:18:26 +01:00
kingbri	81b504e8c5	OAI: Fix typical alias AliasChoices takes strings, not an array. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-05 16:38:39 -05:00
kingbri	2c57dafc59	OAI: Add alias for typical sampling Typical can also be called typical_p Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-05 15:29:53 -05:00
kingbri	d4ed9f703d	Tree: Format Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-04 21:13:30 -05:00
kingbri	c1642076c2	API: Switch unload method to POST GET and POST can be used interchangeably in this case, but adhere to the HTTP spec. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-04 21:11:36 -05:00
kingbri	cd4bf99598	OAI: Fix autodoc examples for model loading Some values weren't defaulting to correct values. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-04 20:53:56 -05:00
kingbri	ceb388e8a0	Start: Override ROCm env variables These are used for supporting GPUs that are not on the "officially supported list". Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-02 21:01:18 -05:00
Brian Dashore	c980f35e1b	Merge pull request #47 from Baysul/patch-1 Only try to install one of the EXLv2 wheels	2024-01-02 20:58:59 -05:00
Basil	2460b2f8ef	Only try to install one of the EXLv2 wheels ...depending on Python version.	2024-01-02 16:56:39 -08:00
kingbri	451042aadf	Main: Don't load if model_name/loras is blank Previously, if model_name was commented out, a load would not occur. Add the case if model_name or loras is blank which returns None when parsing the YAML. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-02 13:56:25 -05:00
kingbri	6b04463051	API: Fix CFG reporting The model endpoint wasn't reporting if CFG is on. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-02 13:54:16 -05:00
kingbri	bbd4ee54ca	Model: Add fallback if negative prompt is empty Fallback to the BOS token since an empty string won't do anything. Ideally, an empty negative prompt should not be used, but it's not the end of the world. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-02 01:46:51 -05:00
kingbri	b378773d0a	Model: Add CFG support CFG, or classifier-free guidance helps push a model in different directions based on what the user provides. Currently, CFG is ignored if the negative prompt is blank (it shouldn't be used in that way anyways). Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-02 01:46:51 -05:00
kingbri	bb7a8e4614	Config: Add override argparser Add an argparser that casts over to dictionaries of subgroups to integrate with the config. This argparser doesn't contain everything in the config due to complexity issues with CLI args, but will eventually progress to parity. In addition, it's used to override the config.yml rather than replace it. A config arg is also provided if the user wants to fully override the config yaml with another file path. Signed-off-by: kingbri <bdashore3@proton.me>	2024-01-01 14:27:12 -05:00
kingbri	7176fa66f0	Update README Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-31 11:25:18 -05:00
kingbri	979a9d28a3	Tree: Format Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-31 11:22:18 -05:00
kingbri	528d20ca5b	Update README Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-31 11:21:13 -05:00
kingbri	72bc30343c	Model: Fix frequency penalty fallback The appropriate branches weren't firing when frequency penalty is 0.0. Also fix repetition penalty overriding. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-31 11:21:07 -05:00
kingbri	47744fe9f7	Update README Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-31 01:48:10 -05:00
kingbri	0dc12d82d5	Model: Add fallback for freq and presence pen Previous behavior aliased freq pen for rep pen. Keep this behavior when using the freq pen parameter with a legacy exllamav2 version rather than ignoring both entirely. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-30 00:24:15 -05:00
kingbri	79a57588d5	API: Add template list endpoint Fetches all template names that a user has in the templates directory for chat completions. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-29 22:58:55 -05:00
kingbri	dce8c74edc	API: Add clarification and cleanup autodocs It's possible to override parts of the example JSON to give proper examples of values. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-29 10:28:06 -05:00
kingbri	4136f19058	Config: Make the sample a drop-in solution With the new wiki, all parameters are fully documented along with comments in the YAML file itself. This should help new users who pull, copy the config, and can't start the API due to subsections being uncommented and read. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-29 01:36:21 -05:00
kingbri	ec929728d9	Model: Read scale_pos_emb from config In newer versions of exllamav2, this value is read from the model's config.json. This value will still default to 1.0 anyways. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-28 21:14:24 -05:00
city-unit	e70729b0c0	Update Docker Squash commit that merges #43, #44, and #45 Create .dockerignore Make compose marginally better Un-scuffed the Dockerfile	2023-12-28 18:26:04 -05:00
kingbri	5dc2df68be	Model: Repetition penalty range -> penalty range All penalties can have a sustain (range) applied to them in exl2, so clarify the parameter. However, the default behaviors change based on if freq OR pres pen is enabled. For the sanity of OAI users, have freq and pres pen only apply on the output tokens when range is -1 (default). But, repetition penalty still functions the same way where -1 means the range is the max seq len. Doing this prevents gibberish output when using the more modern freq and presence penalties similar to llamacpp. NOTE: This logic is still subject to change in the future, but I believe it hits the happy medium for users who want defaults and users who want to tinker around with the sampling knobs. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-28 18:16:10 -05:00
kingbri	c72d30918c	Config: Default None -> Empty in comments Empty makes more sense when talking about empty fields. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-28 00:32:29 -05:00
kingbri	f56221ff0c	Tree: Format Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-28 00:31:59 -05:00
kingbri	3622710582	API: Fix num_experts_per_token reporting This wasn't linked to the model config. This value can be 1 if a MoE model isn't loaded. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-28 00:31:14 -05:00
kingbri	c5bbfd97b2	Entrypoint: Load loras after model Prevents an error if the model isn't loaded on startup. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-27 23:55:02 -05:00
kingbri	ee84d892b8	Start: Add shell script Same as the batch file. Also edit the python script to work when a venv is clean. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-27 23:53:14 -05:00
kingbri	ac0d6f8869	Tree: Format and cleanup start Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-27 01:17:31 -05:00
kingbri	4d83d1aae4	Start: Switch to python script Direct python can be used for requirements checking. Remove the ps1 script and create a venv purely in batch. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-27 00:37:53 -05:00
kingbri	a71b96a20c	Main: Switch to entrypoint Allows for other modules to access the startup function. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-27 00:34:50 -05:00
kingbri	e92ef8f5c7	OAI: Fix rep pen range alias No need to unwrap because the Pydantic alias does that for us. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-25 15:37:11 -05:00
kingbri	7b74cb28e6	Model: Move unsupported sampler check Overbloated the generation function. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-25 15:29:51 -05:00
kingbri	e256ff8182	Samplers: Add frequency and presence penalty Un-alias repetition penalty from the frequency penalty parameter. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-25 15:27:32 -05:00
kingbri	442bb59f8f	Tests: Remove logger class The logger module could not be found when calling the test. Re-add the color logging at a later time. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-25 15:20:39 -05:00
kingbri	162c13752a	Requirements: Update to Flash Attention 2.4.1 Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-25 14:40:08 -05:00
kingbri	5c08316d18	Start: Switch to Write-Host Write-Output is equal to a return statement and breaks parts of the script. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-25 11:59:58 -05:00
kingbri	670ccac19a	Start: Add option to not install wheels Building from source is a case for many wheels, so add an option to skip wheel upgrades/installation if the user uses the start script. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-25 11:49:56 -05:00
kingbri	09ae71aa91	OAI: Add finish to completions OAI spec requires [DONE] to be sent over SSE to signal that a generation is completed. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-25 11:25:38 -05:00
kingbri	cc3229c109	Scripts: Make Start.bat idiotproof Start now creates a venv, installs the correct requirements, and starts the API. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-24 20:50:24 -05:00
kingbri	060d422e03	Config: Resolve filepath This maps the absolute path when loading the config file. Making things safer when loading and finding the correct path. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-23 23:57:33 -05:00
kingbri	703a114f63	Tree: Format Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-23 23:03:28 -05:00
kingbri	c9126c3145	Config: Isolate to a separate file Reduce dependency of globals in main to simplify code a bit. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-23 23:02:37 -05:00
kingbri	0d2e726e82	Main: Fix import formatting Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-23 21:33:15 -05:00
kingbri	3461f8294f	Logging: Clarify preferences Preferences are preferences, not a config. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-23 21:08:10 -05:00
kingbri	98a7b951b9	Logging: Add newlines to Prompt and Response Makes things clearer rather than adding an extra space. Signed-off-by: kingbri <bdashore3@proton.me>	2023-12-22 23:55:22 -05:00

641 commits