Think Deep, Not Just Long: Measuring LLM Reasoning Effort via Deep-Thinking Tokens
3 Pith papers cite this work.
3 representative citing papers:
- Latent State Design for World Models under Sufficiency Constraints: World models succeed when their latent states are built to meet task-specific sufficiency constraints rather than preserving the maximum amount of information.
- Stateful Reasoning via Insight Replay: InsightReplay improves LLM accuracy on reasoning benchmarks by extracting and replaying critical insights to maintain their accessibility during extended chain-of-thought generation.
- Spatiotemporal Hidden-State Dynamics as a Signature of Internal Reasoning in Large Language Models: Large reasoning models show measurable hidden-state dynamics that a new statistic can use to distinguish correct reasoning trajectories without labels.