Benchmarking post-training quantization of large language models under microscaling floating point formats
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- SOAR: Scale Optimization for Accurate Reconstruction in NVFP4 Quantization
  SOAR improves NVFP4 post-training quantization accuracy for LLMs by analytically solving joint scale optimization and searching decoupled scales.
- QuantClaw: Precision Where It Matters for OpenClaw
  QuantClaw dynamically routes precision in agent workflows to cut cost by up to 21.4% and latency by 15.7% while keeping or improving task performance.