INT vs FP: A comprehensive study of fine-grained low-bit quantization formats
3 Pith papers cite this work. Polarity classification is still indexing.
[Verdict timeline by year: 3 verdicts in 2026, all currently UNVERDICTED; 3 representative citing papers are listed in the explorer below.]
Citing papers explorer
- QuantClaw: Precision Where It Matters for OpenClaw
  QuantClaw dynamically routes precision in agent workflows to cut cost by up to 21.4% and latency by 15.7% while keeping or improving task performance. A hypothetical routing sketch follows this list.
- AdaHOP: Fast and Accurate Low-Precision Training via Outlier-Pattern-Aware Rotation
  AdaHOP applies pattern-aware Hadamard transforms and selective outlier extraction to enable from-scratch MXFP4 training of LLMs at BF16 quality, with up to 3.6x memory compression and 1.46x speedup. A rotate-then-quantize sketch also follows this list.
- HiFloat4 Format for Language Model Pre-training on Ascend NPUs
  HiFloat4, an FP4 format paired with stabilization techniques, trains dense and MoE language models on Ascend NPUs with relative error within 1% of full-precision baselines.
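The QuantClaw entry describes routing each agent step to a cheaper or more capable precision level. Its actual policy is not given in the summary above, so the sketch below is a minimal hypothetical difficulty-threshold router; the Backend class, the estimate_difficulty heuristic, the backend names, and the 0.6 threshold are all invented for illustration.

```python
# Hypothetical sketch only: QuantClaw's real routing policy, thresholds,
# and backend names are not given in the summary above.
from dataclasses import dataclass

@dataclass
class Backend:
    name: str             # e.g. an INT4-quantized vs. a full-precision deployment
    cost_per_call: float  # relative cost, kept here only for bookkeeping

def estimate_difficulty(task: str) -> float:
    """Toy difficulty proxy: long or error-laden steps score higher.
    A real router could use a learned classifier or logit-confidence signals."""
    score = min(len(task) / 500.0, 1.0)
    if "traceback" in task.lower() or "proof" in task.lower():
        score = max(score, 0.8)
    return score

def route(task: str, low: Backend, high: Backend, threshold: float = 0.6) -> Backend:
    """Send easy steps to the cheap low-precision backend and hard steps
    to the full-precision one: precision goes where it matters."""
    return high if estimate_difficulty(task) >= threshold else low

low = Backend("agent-llm-int4", cost_per_call=0.2)
high = Backend("agent-llm-fp16", cost_per_call=1.0)
print(route("summarize this changelog", low, high).name)            # agent-llm-int4
print(route("debug this traceback: KeyError", low, high).name)      # agent-llm-fp16
```

Only the cheap calls that clear the threshold are escalated, which is how a router of this shape can cut average cost and latency without touching peak capability.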
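AdaHOP's Hadamard step targets the failure mode where a single outlier inflates a block's shared scale and wipes out resolution for the other values in the block. The NumPy sketch below pairs a plain (not pattern-aware) Sylvester Hadamard rotation with fake MXFP4 quantization, assuming the OCP MX layout: 32-element blocks, a shared power-of-two scale, and the FP4 E2M1 value grid {0, 0.5, 1, 1.5, 2, 3, 4, 6}. AdaHOP's selective, pattern-aware transform and outlier extraction go beyond this unselective baseline.

```python
import numpy as np

FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])  # E2M1 magnitudes

def hadamard(n: int) -> np.ndarray:
    """Sylvester construction; n must be a power of two."""
    h = np.array([[1.0]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h / np.sqrt(n)  # orthonormal, so H.T undoes H

def mxfp4_quant_dequant(x: np.ndarray) -> np.ndarray:
    """Fake-quantize one 32-element block to MXFP4 and back."""
    amax = np.max(np.abs(x))
    if amax == 0.0:
        return x.copy()
    # Shared power-of-two (E8M0-style) scale; 2 is the exponent of
    # the largest FP4 magnitude (6.0 = 1.5 * 2**2).
    scale = 2.0 ** (np.floor(np.log2(amax)) - 2)
    mags = np.abs(x) / scale
    # Round each magnitude to the nearest FP4 grid point (clips above 6).
    idx = np.argmin(np.abs(mags[:, None] - FP4_GRID[None, :]), axis=1)
    return np.sign(x) * FP4_GRID[idx] * scale

rng = np.random.default_rng(0)
block = rng.normal(size=32)
block[3] = 10.0  # inject an outlier that would dominate the shared scale

H = hadamard(32)
direct = mxfp4_quant_dequant(block)
rotated = H.T @ mxfp4_quant_dequant(H @ block)  # rotate, quantize, un-rotate

print("direct  MSE:", np.mean((block - direct) ** 2))
print("rotated MSE:", np.mean((block - rotated) ** 2))
```

The rotation spreads the outlier's energy across all 32 coordinates, so the block's shared scale shrinks and the small values keep their resolution; the two printed reconstruction errors let you compare the effect directly.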