arXiv preprint arXiv:2602.13517

Wei-Lin Chen, Liqian Peng, Tian Tan, Chao Zhao, Blake JianHang Chen, Ziqian Lin, Alec Go, Yu Meng · 2026 · arXiv 2602.13517

9 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Co-Evolving Skill Generation and Policy Optimization

cs.CL · 2026-06-07 · unverdicted · novelty 7.0

The framework estimates the context-dependent marginal utility of candidate skills from reward gaps between matched base and skill-augmented rollouts, then uses those estimates to filter skills and to co-train the policy as the skill generator.

GRASP: Learning to Ground Social Reasoning in Multi-Person Non-Verbal Interactions

cs.CV · 2026-05-15 · unverdicted · novelty 7.0

GRASP is a large-scale dataset and benchmark for social reasoning grounded in gaze and gesture events in multi-person videos, with Social Grounding Reward (SGR) proposed to improve model performance on GRASP-Bench.

Latent State Design for World Models under Sufficiency Constraints

cs.AI · 2026-05-03 · unverdicted · novelty 7.0

World models succeed when their latent states are built to meet task-specific sufficiency constraints rather than preserving the maximum amount of information.

The Tutoring Effectiveness Index: Predicting LLM Math Tutor Quality from Four Conversation Signals

cs.CY · 2026-05-28 · unverdicted · novelty 6.0

The Tutoring Effectiveness Index (TEI) uses four signals from LLM conversations to select math tutoring responses, raising student improvement rates from 59.0% to 81.9% at N=8 on a frozen DeepSeek-R1-8B model without training or judges.

Stateful Reasoning via Insight Replay

cs.AI · 2026-05-14 · unverdicted · novelty 6.0 · 2 refs

InsightReplay improves long CoT reasoning by extracting critical insights from the trace and replaying them near the active frontier, delivering +1.65 average accuracy gain across 24 model-benchmark settings.

Spatiotemporal Hidden-State Dynamics as a Signature of Internal Reasoning in Large Language Models

cs.CL · 2026-05-03 · unverdicted · novelty 6.0

Large reasoning models show measurable hidden-state dynamics that a new statistic can use to distinguish correct reasoning trajectories without labels.

When Do LLMs Reason? A Dynamical Systems View via Entropy Phase Transitions

cs.LG · 2026-05-20 · unverdicted · novelty 5.0

Early entropy dynamics during LLM decoding mark when explicit reasoning becomes beneficial, enabling the training-free EDRM router that selects strategies per instance and yields 41-55% token savings with accuracy gains across 15 benchmarks.

Integrated and Cross-Architecture Interpretation of LLM Reasoning

cs.CL · 2026-05-27 · unverdicted · novelty 4.0

Proposes IAR framework using MIP token isolation, DTR overlap analysis, and Jaccard stability to interpret reasoning patterns in Qwen and Llama models across math, code, logic, and commonsense domains.

Effort as Ceiling, Not Dial: Reasoning Budget Does Not Modulate Cognitive Cost Alignment Between Humans and Large Reasoning Models

cs.CL · 2026-05-16 · unverdicted · novelty 4.0

Reasoning budget in LRMs functions as a generation ceiling rather than a real-time dial, leaving cognitive cost alignment with humans invariant across effort levels and supporting a training-time compiled account.

citing papers explorer


GRASP: Learning to Ground Social Reasoning in Multi-Person Non-Verbal Interactions

cs.CV · 2026-05-15 · unverdicted · none · ref 11
