
TurboQuant: Google's quantization method cuts KV cache memory by 6x with no accuracy loss

4 sources
