Tag: ml
All the articles with the tag "ml".
- An FP8 KV cache for mini-sglang: what I learned shipping the v1
Shipping FP8 KV cache for mini-sglang: 2× capacity, up to 27% faster decode, quality intact.