ConCISE: Confidence-guided compression in step-by-step efficient reasoning
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CL (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
CAT uses intrinsic confidence signals in preference optimization to adapt reasoning length in LRMs, outperforming uniform compression baselines on accuracy across benchmarks.
citing papers explorer
- Stop When Reasoning Converges: Semantic-Preserving Early Exit for Reasoning Models
  PUMA detects reasoning-level semantic redundancy to enable early exit in chains of thought, achieving 26.2% average token reduction across five LRMs and five benchmarks while preserving accuracy and CoT quality.
- CAT: Confidence-Adaptive Thinking for Efficient Reasoning of Large Reasoning Models
  CAT uses intrinsic confidence signals in preference optimization to adapt reasoning length in LRMs, outperforming uniform compression baselines on accuracy across benchmarks.