TwinQuant learns quantization-friendly subspaces for 4-bit LLM weights via manifold optimization and a fused kernel, preserving near-FP16 accuracy with up to 1.8x speedup on LLaMA3 and Qwen3 models.
Efficient Riemannian optimization on the Stiefel manifold via the Cayley transform
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: background 1, method 1
representative citing papers
SpinQuant learns optimal rotations to enable accurate 4-bit quantization of LLM weights, activations, and KV cache, reducing the zero-shot gap to full precision to 2.9 points on LLaMA-2 7B.
A quantum solver for PDEs is introduced via flexible matrix product operator representations with mid-circuit measurements and state-dependent norm correction to handle non-unitary dynamics.
Pion is an optimizer that preserves the singular values of weight matrices in LLM training by applying orthogonal equivalence transformations.
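The property behind the last summary, that an orthogonal equivalence transformation leaves singular values unchanged, can be checked numerically. The following is a minimal sketch and not the algorithm of Pion or of the cited work: it assumes NumPy, and the helper names `cayley` and `random_skew` are hypothetical. It builds orthogonal matrices from skew-symmetric ones with the Cayley transform named in the cited title, then compares the singular values of W and U W Vᵀ.

```python
import numpy as np

def cayley(A):
    """Cayley transform: map a skew-symmetric A to Q = (I - A)^{-1} (I + A).

    For skew-symmetric A, I - A is always invertible and Q is orthogonal.
    (Hypothetical helper, for illustration only.)
    """
    I = np.eye(A.shape[0])
    return np.linalg.solve(I - A, I + A)

def random_skew(n, rng):
    """Return a random n x n skew-symmetric matrix (hypothetical helper)."""
    M = rng.standard_normal((n, n))
    return 0.5 * (M - M.T)

rng = np.random.default_rng(0)

# A stand-in "weight matrix"; the shape is arbitrary.
W = rng.standard_normal((4, 3))

# Orthogonal factors acting on the rows and columns of W.
U = cayley(random_skew(4, rng))
V = cayley(random_skew(3, rng))

# Orthogonal equivalence transformation.
W_new = U @ W @ V.T

sv_before = np.linalg.svd(W, compute_uv=False)
sv_after = np.linalg.svd(W_new, compute_uv=False)

print("U orthogonal:", np.allclose(U.T @ U, np.eye(4)))
print("singular values preserved:", np.allclose(sv_before, sv_after))
```

Because the Cayley transform parameterizes orthogonal matrices by unconstrained skew-symmetric ones, a sketch like this avoids an explicit re-orthogonalization step; whether any of the listed papers uses it in exactly this form is not stated here.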