Reasoning models can be effective without thinking

22 Pith papers cite this work. Polarity classification is still indexing.

citation summary

years: 2026 (16), 2025 (6)

roles: background 3

polarities: background 3

representative citing papers

ThoughtFold: Folding Reasoning Chains via Introspective Preference Learning

cs.AI · 2026-06-02 · unverdicted · novelty 6.0

ThoughtFold applies introspective redundancy detection within correct CoT trajectories to create sub-trajectory spectra, then uses masked preference optimization to penalize redundant explorations, yielding 56% token reduction on DeepSeek-R1-Distill-Qwen-7B while preserving accuracy.

Step-GRPO: Internalizing Dynamic Early Exit for Efficient Reasoning

cs.AI · 2026-04-18 · unverdicted · novelty 6.0

Step-GRPO internalizes dynamic early exit into reasoning models via step-structured optimization, Dynamic Truncated Rollout, and Step-Aware Relative Reward, delivering 32% token reduction on Qwen3-8B with no accuracy loss.

Thinking Sparks!: Emergent Attention Heads in Reasoning Models During Post Training

cs.AI · 2025-09-30 · unverdicted · novelty 6.0

Post-training on reasoning tasks sparks the emergence of specialized attention heads that enable structured computation, with SFT adding stable heads while GRPO uses dynamic activation and pruning tied to reward signals, and controllable think models relying on compensatory heads instead of specific

Efficient Test-Time Scaling via Temporal Reasoning Aggregation

cs.AI · 2026-04-19 · unverdicted · novelty 5.0

TRACE aggregates answer consistency and confidence trajectory over multiple reasoning steps to decide when to halt inference, reducing token usage by 25-30% while keeping accuracy within 1-2% of full reasoning.

Self-Aligned Reward: Towards Effective and Efficient Reasoners

cs.LG · 2025-09-05 · unverdicted · novelty 5.0

Self-aligned reward uses relative perplexity differences to encourage concise, query-specific reasoning in LLMs, yielding 4% accuracy gains and 30% lower inference cost when added to PPO or GRPO.
