arXiv preprint arXiv:2505.17813

Michael Hassid, Gabriel Synnaeve, Yossi Adi, Roy Schwartz · 2025 · arXiv 2505.17813

9 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

State commitment learning: training language models to distinguish computation from memory

cs.LG · 2026-05-22 · unverdicted · novelty 7.0

Introduces state commitment learning and Counterfactual Erasure RL (CERL) to train models to commit only persistent state, reducing answer dependence on hidden thoughts across math, logic, QA, and tool-use tasks without accuracy loss.

Uncovering the Representation Geometry of Minimal Cores in Overcomplete Reasoning Traces

cs.AI · 2026-05-14 · unverdicted · novelty 7.0

Language models produce overcomplete reasoning traces where on average 46% of steps can be removed while preserving the answer in 86% of cases, with necessity concentrated in the top three steps.

Deep Reasoning in General Purpose Agents via Structured Meta-Cognition

cs.CL · 2026-05-12 · unverdicted · novelty 7.0

DOLORES, an agent using a formal language for meta-reasoning to construct adaptive scaffolds on the fly, outperforms prior scaffolding methods by 24.8% on average across four hard benchmarks and multiple model sizes.

CLORE: Content-Level Optimization for Reasoning Efficiency

cs.AI · 2026-05-21 · unverdicted · novelty 6.0

CLORE augments correct on-policy rollouts by deleting repetitive and irrelevant segments then optimizes with auxiliary DPO to improve accuracy-efficiency trade-off on math benchmarks.

VSPO: Vector-Steered Policy Optimization for Behavioral Control

cs.LG · 2026-05-15 · unverdicted · novelty 6.0

VSPO samples rollouts at varying steering intensities to improve behavioral control in LLMs while preserving task accuracy.

HypEHR: Hyperbolic Modeling of Electronic Health Records for Efficient Question Answering

cs.AI · 2026-04-22 · unverdicted · novelty 6.0

HypEHR is a hyperbolic embedding model for EHR data that uses Lorentzian geometry and hierarchy-aware pretraining to answer clinical questions nearly as well as large language models but with much smaller size.

Procedural Knowledge at Scale Improves Reasoning

cs.CL · 2026-04-01 · unverdicted · novelty 6.0

Reasoning Memory decomposes reasoning trajectories into 32 million subquestion-subroutine pairs and retrieves them via in-thought prompts to improve language model performance on math, science, and coding benchmarks by up to 19.2%.

Evaluating Advanced Prompting on Gemini Flash for Multi-Hop Biomedical QA

cs.IR · 2026-05-05 · unverdicted · novelty 2.0

Sophisticated prompting on Gemini 2.0 Flash achieves a 0.720 Concept Level Score on MedHopQA, outperforming baseline by 0.155 and matching Gemini 2.5 Flash performance.

Cut Your Losses! Learning to Prune Paths Early for Efficient Parallel Reasoning

cs.CL · 2026-04-17

citing papers explorer

Showing 2 of 2 citing papers after filters.

State commitment learning: training language models to distinguish computation from memory

cs.LG · 2026-05-22 · unverdicted · none · ref 4

Introduces state commitment learning and Counterfactual Erasure RL (CERL) to train models to commit only persistent state, reducing answer dependence on hidden thoughts across math, logic, QA, and tool-use tasks without accuracy loss.

VSPO: Vector-Steered Policy Optimization for Behavioral Control

cs.LG · 2026-05-15 · unverdicted · none · ref 8

VSPO samples rollouts at varying steering intensities to improve behavioral control in LLMs while preserving task accuracy.
