Cage: Curvature-aware gradient estimation for accurate quantization-aware training. arXiv preprint arXiv:2510.18784

Tabesh, S · 2025 · arXiv 2510.18784

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 1

citation-polarity summary

background: 1

representative citing papers

Zero-Shot Quantization via Weight-Space Arithmetic

cs.CV · 2026-04-03 · unverdicted · novelty 8.0

A quantization vector derived from a donor model via weight-space arithmetic can be added to a receiver model to improve post-PTQ Top-1 accuracy by up to 60 points in 3-bit settings without receiver-side QAT or data.

WinQ: Accelerating Quantization-Aware Training of Language Models Around Saddle Points

cs.LG · 2026-05-17 · unverdicted · novelty 6.0

WinQ accelerates quantization-aware training up to 4x and improves sub-4-bit accuracy up to 8.8% by weight interpolation resets and noise-regularized gradients that increase Hessian eigenvalue magnitudes around saddle points.

TinyNeRV: Compact Neural Video Representations via Capacity Scaling, Distillation, and Low-Precision Inference

cs.CV · 2026-04-10 · unverdicted · novelty 4.0

Tiny NeRV models using capacity scaling, frequency-aware distillation, and low-precision quantization achieve favorable quality-efficiency trade-offs with far fewer parameters and lower computational costs than standard NeRV.

citing papers explorer

Zero-Shot Quantization via Weight-Space Arithmetic

cs.CV · 2026-04-03 · unverdicted · none · ref 7

WinQ: Accelerating Quantization-Aware Training of Language Models Around Saddle Points

cs.LG · 2026-05-17 · unverdicted · none · ref 32

TinyNeRV: Compact Neural Video Representations via Capacity Scaling, Distillation, and Low-Precision Inference

cs.CV · 2026-04-10 · unverdicted · none · ref 67
