pith. sign in

Quantization hurts reasoning? an empirical study on quantized reasoning models.arXiv preprint arXiv:2504.04823

12 Pith papers cite this work. Polarity classification is still indexing.

12 Pith papers citing it

citation-role summary

background 3

citation-polarity summary

years

2026 11 2025 1

roles

background 3

polarities

background 3

representative citing papers

Search Your Block Floating Point Scales!

cs.LG · 2026-05-12 · unverdicted · novelty 6.0

ScaleSearch optimizes block floating point scales via fine-grained search to cut quantization error by 27% for NVFP4, improving PTQ by up to 15 points on MATH500 for Qwen3-8B and attention PPL by 0.77 on Llama 3.1 70B.

QuantClaw: Precision Where It Matters for OpenClaw

cs.AI · 2026-04-24 · unverdicted · novelty 6.0

QuantClaw dynamically routes precision in agent workflows to cut cost by up to 21.4% and latency by 15.7% while keeping or improving task performance.

Cassandra: Enabling Reasoning LLMs at Edge via Self-Speculative Decoding

cs.AR · 2026-05-26 · unverdicted · novelty 5.0

Cassandra is a self-speculative decoding system that builds a draft model via fine-grained data selection and optimized pruning/mantissa truncation, achieving up to 2.41x speedup over BF16 and 1.81x more tokens than Eagle-3 on Llama 3 8B without training.

citing papers explorer

Showing 12 of 12 citing papers.