Model: Cast autosplit_reserve to int

Torch errors if float values are passed, because byte counts must be integers. Therefore, round the reserve up (overestimate) and cast it to an int type.

Resolves #97

Signed-off-by: kingbri <bdashore3@proton.me>
2024-04-21 23:47:37 -04:00
commit 88b0b6f4f1
parent cab789e685
1 changed file with 5 additions and 1 deletion
--- a/backends/exllamav2/model.py
+++ b/backends/exllamav2/model.py
@@ -1,6 +1,7 @@
 """The model container class for ExLlamaV2 models."""

 import gc
+import math
 import pathlib
 import threading
 import time
@@ -130,7 +131,10 @@ class ExllamaV2Container:

            autosplit_reserve_megabytes = unwrap(kwargs.get("autosplit_reserve"), [96])
            self.autosplit_reserve = list(
-                map(lambda value: value * 1024**2, autosplit_reserve_megabytes)
+                map(
+                    lambda value: int(math.ceil(value * 1024**2)),
+                    autosplit_reserve_megabytes,
+                )
            )
        elif gpu_count > 1:
            # Manual GPU split
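
The conversion in this change can be tried in isolation. The sketch below is not part of the commit: the helper name `reserve_megabytes_to_bytes` is hypothetical, and it only illustrates how a fractional megabyte value ends up as a whole, rounded-up byte count.

```python
import math


def reserve_megabytes_to_bytes(reserve_megabytes):
    """Convert per-GPU reserve sizes from megabytes to whole bytes.

    Hypothetical helper for illustration only. Rounds up so the reserve
    is never smaller than what was requested.
    """
    # math.ceil returns an int for both int and float inputs in Python 3
    return [math.ceil(value * 1024**2) for value in reserve_megabytes]


# An integer value, a fractional value, and a float that is already whole
reserve_bytes = reserve_megabytes_to_bytes([96, 0.1, 96.0])
print(reserve_bytes)
```

A fractional input such as `0.1` would otherwise produce a float byte count (`104857.6`), which is the kind of value the commit message says Torch rejects. Rounding up rather than truncating keeps the reserve at or above the requested size.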