FPTQuant: Function-preserving transforms for LLM quantization

Boris van Breugel, Yelysei Bondarenko, Paul Whatmough, Markus Nagel · 2025 · arXiv 2506.04985

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

When Quantization Is Free: An int4 KV Cache That Outruns fp16 on Apple Silicon

cs.PF · 2026-05-07 · unverdicted · novelty 7.0

A single fused int4 KV cache kernel on Apple Silicon outperforms fp16 in latency with 3x memory compression and near-zero quality loss on tested models.

citing papers explorer

When Quantization Is Free: An int4 KV Cache That Outruns fp16 on Apple Silicon

cs.PF · 2026-05-07 · unverdicted · none · ref 26
