Benchmarking post-training quantization of large language models under microscaling floating point formats

Zhang, M · 2026 · arXiv 2601.09555

2 Pith papers cite this work.

representative citing papers

SOAR: Scale Optimization for Accurate Reconstruction in NVFP4 Quantization

cs.LG · 2026-05-12 · unverdicted · novelty 6.0

SOAR improves NVFP4 post-training quantization accuracy for LLMs by analytically solving joint scale optimization and searching decoupled scales.

QuantClaw: Precision Where It Matters for OpenClaw

cs.AI · 2026-04-24 · unverdicted · novelty 6.0

QuantClaw dynamically routes precision in agent workflows to cut cost by up to 21.4% and latency by 15.7% while keeping or improving task performance.

citing papers explorer

Showing 2 of 2 citing papers.

SOAR: Scale Optimization for Accurate Reconstruction in NVFP4 Quantization · cs.LG · 2026-05-12 · unverdicted · none · ref 47
SOAR improves NVFP4 post-training quantization accuracy for LLMs by analytically solving joint scale optimization and searching decoupled scales.
QuantClaw: Precision Where It Matters for OpenClaw · cs.AI · 2026-04-24 · unverdicted · none · ref 31
QuantClaw dynamically routes precision in agent workflows to cut cost by up to 21.4% and latency by 15.7% while keeping or improving task performance.
