* Exposed draft model args for speculative decoding
* Exposed int8 cache, dummy models, and no flash attention
* Resolved CUDA 11.8 dependency issue
:3
* Note: this uses wheels for Python 3.10 and torch 2.1.0+cu118, the current defaults in Colab