Cage: Curvature-aware gradient estimation for accurate quantization-aware training. arXiv preprint arXiv:2510.18784
3 Pith papers cite this work. Polarity classification is still indexing.
citation summary
years: 2026 (3)
verdicts: UNVERDICTED (3)
roles: background (1)
polarities: background (1)
citing papers explorer
- Zero-Shot Quantization via Weight-Space Arithmetic
A quantization vector derived from a donor model via weight-space arithmetic can be added to a receiver model to improve post-PTQ Top-1 accuracy by up to 60 points in 3-bit settings without receiver-side QAT or data.
- WinQ: Accelerating Quantization-Aware Training of Language Models Around Saddle Points
WinQ accelerates quantization-aware training up to 4x and improves sub-4-bit accuracy up to 8.8% by weight interpolation resets and noise-regularized gradients that increase Hessian eigenvalue magnitudes around saddle points.
- TinyNeRV: Compact Neural Video Representations via Capacity Scaling, Distillation, and Low-Precision Inference
Tiny NeRV models using capacity scaling, frequency-aware distillation, and low-precision quantization achieve favorable quality-efficiency trade-offs with far fewer parameters and lower computational costs than standard NeRV.
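
As a rough illustration of the weight-space arithmetic mentioned in the Zero-Shot Quantization entry above, the following is a minimal sketch. It assumes the quantization vector is the elementwise difference between a quantization-aware donor checkpoint and its full-precision counterpart, in the style of task vectors. That definition, the function names, the `scale` parameter, and the toy round-to-nearest quantizer are all assumptions made for illustration and are not taken from the paper.

```python
# Illustrative sketch only: the recipe and all names here are assumptions,
# not the cited paper's implementation.

def quantization_vector(donor_fp, donor_qat):
    """Elementwise difference: quantization-aware donor minus full-precision donor."""
    return {
        name: [q - f for q, f in zip(donor_qat[name], donor_fp[name])]
        for name in donor_fp
    }

def apply_vector(receiver, vector, scale=1.0):
    """Add the (optionally scaled) donor vector to the receiver's weights."""
    return {
        name: [w + scale * d for w, d in zip(receiver[name], vector[name])]
        for name in receiver
    }

def uniform_quantize(weights, bits=3):
    """Toy symmetric round-to-nearest quantizer standing in for a real PTQ step."""
    levels = 2 ** (bits - 1) - 1                 # 3 positive levels for 3 bits
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero step size
    step = max_abs / levels
    return [round(w / step) * step for w in weights]

# Toy single-tensor "models" (hypothetical values).
donor_fp = {"layer.weight": [0.5, -0.25, 1.0]}
donor_qat = {"layer.weight": [0.75, -0.25, 1.0]}
receiver = {"layer.weight": [0.25, 0.5, -1.0]}

vec = quantization_vector(donor_fp, donor_qat)
patched = apply_vector(receiver, vec)
quantized = {name: uniform_quantize(w, bits=3) for name, w in patched.items()}
```

The receiver is never trained or shown data in this sketch: the only receiver-side steps are the addition and the final quantization, which matches the "without receiver-side QAT or data" claim in spirit.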