Commit graph

  • 39735063dd Update README.md main JZ273 2024-10-20 11:20:06 -04:00
  • 4ace973244 formatting changes John 2024-10-13 07:27:27 -04:00
  • 1fd4a9e119 enabled api routes to work with open-webui John 2024-10-12 20:14:57 -04:00
  • 1d3a308709 Fix wiki link in README.md turboderp 2025-08-26 13:03:18 +02:00
  • d7eb580e99 Start: Fix uv check kingbri 2025-08-21 18:23:42 -04:00
  • 4036c70d75 Tree: Format kingbri 2025-08-19 22:58:25 -04:00
  • bd3aa5bb04 Docs: Add uv section kingbri 2025-08-19 22:54:27 -04:00
  • 1f4186512e Start: Add check for uv kingbri 2025-08-19 22:49:13 -04:00
  • 30a3cd75cf Start: Migrate options from cu121/118 to cu12 kingbri 2025-08-19 22:25:30 -04:00
  • 1344726936 Docs: Sampler overrides part 2 kingbri 2025-08-19 21:19:12 -04:00
  • 86f27c9c93 Merge pull request #377 from DocShotgun/main Brian 2025-08-18 23:12:34 -04:00
  • e07df3951e Docs: Update sampler overrides kingbri 2025-08-18 23:06:16 -04:00
  • 067d63773e Config: Move sampling higher in the list kingbri 2025-08-18 22:55:03 -04:00
  • 6fb0c2cdbd Config: Update description for override_preset default * We provide safe_defaults as a default in config_sample.yml but not internally DocShotgun 2025-08-18 12:39:52 -07:00
  • 998abe5ad1 Config: Enable safe sampler overrides by default * Provides safe fallback samplers, intended for better out-of-the-box support for clients that do not pass sampler params DocShotgun 2025-08-18 12:32:28 -07:00
  • a4d02c2b70 Model: Add log messages for model loading kingbri 2025-08-17 23:09:27 -04:00
  • a3a32c30a4 Model: Add utils file kingbri 2025-08-17 22:43:19 -04:00
  • 05791a25a1 Merge pull request #375 from Ph0rk0z/patch-1 Brian 2025-08-17 22:37:25 -04:00
  • 43f9483bc4 Model: Add tensor_parallel_backend option kingbri 2025-08-17 21:42:30 -04:00
  • b9952f319e Merge branch 'main' into exl3-tp kingbri 2025-08-17 21:21:40 -04:00
  • f2a39e3a61 Dependencies: Update exllama, torch, and flash attention kingbri 2025-08-17 21:19:23 -04:00
  • 60ae419746 Model.py TP changes Forkoz 2025-08-12 21:01:54 +00:00
  • 6623dbcd86 Merge pull request #373 from AUTOMATIC1111/exl3-logprobs Brian 2025-08-05 01:24:06 -04:00
  • fe149489af Tree: Format kingbri 2025-08-05 01:22:18 -04:00
  • 83f778db2d Merge pull request #374 from DocShotgun/main Brian 2025-08-05 01:18:25 -04:00
  • 81a115b781 Templating: Support chat_template.jinja DocShotgun 2025-08-03 16:10:08 -07:00
  • 056527ceb3 add logprobs support for exl3 AUTOMATIC 2025-08-03 11:42:32 +03:00
  • 03d72a37be Merge pull request #371 from DocShotgun/main Brian 2025-08-01 14:02:57 -04:00
  • 102af306e5 Config: Remove developer arg cuda_malloc_backend * cudaMallocAsync is now enabled by default on supported configurations DocShotgun 2025-08-01 10:59:13 -07:00
  • 113643c0df Main: Enable cudaMallocAsync backend by default kingbri 2025-07-27 22:29:46 -04:00
  • 0b4ca567f8 API: Persist request IDs and append full_text to finish chunk kingbri 2025-07-24 22:28:40 -04:00
  • e77fa0b7a8 Docs: Edit inline loading for breaking changes kingbri 2025-07-24 18:11:42 -04:00
  • ab04a6ed60 Dependencies: Bump ExllamaV3 kingbri 2025-07-18 22:56:35 -04:00
  • bf936f5c39 Dependencies: Update exllamav2 kingbri 2025-07-13 23:33:12 -04:00
  • 2419d2d0a3 Merge pull request #364 from theroyallab/tool-calls Brian 2025-07-11 11:34:10 -04:00
  • 707d005aad API: Default tool call ID and type kingbri 2025-07-11 01:11:09 -04:00
  • 5b1db3ad83 API: Don't do a second re-render when tool calling kingbri 2025-07-06 11:32:36 -04:00
  • 3dfa965019 API: Add tool_call_id for role = tool kingbri 2025-07-05 21:52:26 -04:00
  • 1c3f84151f Docs: Update tool calling kingbri 2025-07-05 21:43:04 -04:00
  • 871f71c4e7 Templates: Adjust tool call example kingbri 2025-07-05 21:24:30 -04:00
  • 879f4cee7e API: Modify tool calling for wider compat kingbri 2025-07-05 14:28:12 -04:00
  • b6a26da50c API: Fix tool call serialization kingbri 2025-07-04 15:02:49 -04:00
  • d23fefbecd API + Model: Fix application of defaults kingbri 2025-07-03 14:37:34 -04:00
  • d339139fb6 Config: Deep merge model overrides kingbri 2025-07-03 12:17:09 -04:00
  • 0152a1665b Downloader: Switch to use API sizes kingbri 2025-06-30 12:49:53 -04:00
  • 03ff4c3128 Downloader: Handle if Content-Length is undefined kingbri 2025-06-30 11:43:22 -04:00
  • 0ae878712e Exl3: Clear image embedding cache on unload turboderp 2025-06-25 23:56:21 +02:00
  • e362319a4d Merge pull request #358 from theroyallab/breaking Brian 2025-06-17 23:10:16 -04:00
  • a02d39de31 Model: Remove rogue print kingbri 2025-06-17 23:09:07 -04:00
  • 2913ce29fc API: Add timings to usage stats kingbri 2025-06-17 22:54:51 -04:00
  • 5d94d4d022 Merge branch 'main' into breaking kingbri 2025-06-17 22:24:32 -04:00
  • 122d87ac36 Tree: Format turboderp 2025-06-15 19:33:14 +02:00
  • 21c5af48e1 Tree: Format turboderp 2025-06-15 19:30:38 +02:00
  • 1c9891bf04 Exl3: Add vision capability turboderp 2025-06-15 19:22:51 +02:00
  • 4605c0f6bd Common: Refactor get_image to common functions turboderp 2025-06-15 19:20:36 +02:00
  • d357f100d0 Dependencies: Bump ExllamaV3 turboderp 2025-06-15 19:12:45 +02:00
  • a0c16bba2a Exl2: Fix banned_strings (move outside of assign_gen_params) turboderp 2025-06-15 16:51:42 +02:00
  • 2096c9bad2 Model: Default max_seq_len to 4096 kingbri 2025-06-13 14:12:03 -04:00
  • 322f9b773a Model: Migrate inline config to new format kingbri 2025-05-26 20:51:28 -04:00
  • a3c780ae58 API: Core: Remove load/template aliases kingbri 2025-05-25 22:15:21 -04:00
  • 0ea56382f0 Dependencies: Fix unsupported dependency error kingbri 2025-06-13 14:56:34 -04:00
  • f4ee56ba13 Update README kingbri 2025-06-13 14:55:36 -04:00
  • 691a080ac7 Dependencies: Bump ExllamaV3 and ExllamaV2 turboderp 2025-05-31 23:54:16 +02:00
  • 2d89c96879 API: Re-add BOS token stripping in template render kingbri 2025-05-24 21:11:53 -04:00
  • 10fbe043a4 API: Fix typing for chat templates in CC requests kingbri 2025-05-24 21:06:05 -04:00
  • 0c4cc1eba3 Model: Add prompt logging to ExllamaV3 kingbri 2025-05-17 21:39:41 -04:00
  • 729caaeddc Merge pull request #346 from gakada/main Brian 2025-05-17 22:05:15 -04:00
  • 0646d358a2 Main: Log auth and sampler overrides after model load kingbri 2025-05-17 18:07:29 -04:00
  • 54b8a20a19 API: Fix types for chat completions kingbri 2025-05-17 18:04:39 -04:00
  • ba6248eec0 Exl3: fix add_bos in generator gakada 2025-05-17 19:10:49 +09:00
  • 81170eee00 Merge pull request #312 from davidallada/add-file-based-logging Brian 2025-05-17 01:24:19 -04:00
  • 17f3dca6fc Packaging: Add agnostic method to check version of packages kingbri 2025-05-17 01:04:24 -04:00
  • 084916c04f Model: Fix autosplit reserve crash with GPU split kingbri 2025-05-17 00:51:14 -04:00
  • 0858b6d4b2 Tree: Format kingbri 2025-05-17 00:46:40 -04:00
  • fa534fe551 Dependencies: Update Ruff kingbri 2025-05-17 00:46:25 -04:00
  • 390daeb92f Model: Create universal HFModel class kingbri 2025-05-13 18:12:38 -04:00
  • 7900b72848 API: Add chat_template_kwargs alias for template_vars kingbri 2025-05-12 15:33:34 -04:00
  • c9dc0b2aa4 Dependencies: Bump ExllamaV3 and ExllamaV2 kingbri 2025-05-12 13:30:38 -04:00
  • bd3fec929c Tree: Format kingbri 2025-05-12 11:32:27 -04:00
  • a524ac3c0f Model: Fix cache mode again kingbri 2025-05-12 11:30:47 -04:00
  • 20cad851e9 Model: Fix param call kingbri 2025-05-12 09:52:28 -04:00
  • d15eb55f20 Model: Fix exl2 cache mode check kingbri 2025-05-12 09:47:49 -04:00
  • 8996dc7b02 API: Add default for backend in model load request kingbri 2025-05-12 09:39:34 -04:00
  • b555eeb6e7 Merge pull request #339 from Maaaxiii/fix/tool-calling-embeddings Brian 2025-05-11 20:41:58 -04:00
  • f4adca1f3e API: Remove default fallback from backend param kingbri 2025-05-11 09:56:53 -04:00
  • 3674d7b9b5 Merge pull request #341 from theroyallab/exl3 Brian 2025-05-10 23:43:02 -04:00
  • 6379081dd8 Sampling: Make add_bos_token override concise kingbri 2025-05-10 19:07:35 -04:00
  • 656af41b5d Model: Always enable decode_special_tokens kingbri 2025-05-09 22:25:50 -04:00
  • 83826b56be Main: Remove unnecessary import kingbri 2025-05-09 22:14:11 -04:00
  • 42346c6b39 Sampling: Remove skip_special_tokens kingbri 2025-05-09 22:11:05 -04:00
  • 25c77ebf77 Model: Remove exllamav2-specific version check kingbri 2025-05-09 22:08:15 -04:00
  • 48ea1737cf Startup: Check agnostically for inference deps kingbri 2025-05-09 21:59:00 -04:00
  • 33ac016023 Dependencies: Add ExllamaV3 kingbri 2025-05-09 21:42:07 -04:00
  • f26ca23f1a Merge pull request #336 from DocShotgun/backend-detect Brian 2025-05-09 01:56:44 -04:00
  • 02a8d68e17 Merge branch 'exl3' into backend-detect Brian 2025-05-08 23:50:33 -04:00
  • d5963007f0 Model: Add backend print kingbri 2025-05-08 23:45:04 -04:00
  • cfee16905b Model: Migrate backend detection to a separate function kingbri 2025-05-08 23:42:39 -04:00
  • 527afc206b Merge pull request #329 from DocShotgun/exl3 Brian 2025-05-08 23:11:45 -04:00
  • 638eef401a Model: Move cache creation to a common function kingbri 2025-05-08 23:10:03 -04:00
  • 22f7f1e1ec fix: flipped parameter name with variable name Maximilian Klem 2025-05-07 21:04:30 +02:00