It's useful for the client to know what the T/s and total time for generation are per-request. Works with both completions and chat completions.

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>

| Name |
|---|
| types |
| utils |
| router.py |
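
The commit message above describes reporting T/s (tokens per second) and total generation time for each request. A minimal sketch of how such figures might be derived follows. The names `GenerationStats` and `compute_stats` are hypothetical and are not taken from this repository; the actual implementation may differ.

```python
import time
from dataclasses import dataclass


@dataclass
class GenerationStats:
    """Hypothetical container for per-request generation metrics."""

    completion_tokens: int
    total_time_s: float
    tokens_per_sec: float


def compute_stats(completion_tokens: int, start: float, end: float) -> GenerationStats:
    """Derive total time and tokens/second from two timestamps in seconds."""
    # Clamp to zero so a clock anomaly never yields a negative duration.
    elapsed = max(end - start, 0.0)
    # Avoid dividing by zero when generation finished within timer resolution.
    tokens_per_sec = completion_tokens / elapsed if elapsed > 0 else 0.0
    return GenerationStats(
        completion_tokens=completion_tokens,
        total_time_s=round(elapsed, 2),
        tokens_per_sec=round(tokens_per_sec, 2),
    )


if __name__ == "__main__":
    start = time.perf_counter()
    generated = ["tok"] * 50  # stand-in for a real generation call
    end = time.perf_counter()
    print(compute_stats(len(generated), start, end))
```

Because both figures depend only on a token count and two timestamps, the same helper could serve completions and chat completions alike, assuming each endpoint tracks when generation started and finished.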