Thinking isn’t an illusion: Overcoming the limitations of reasoning models via tool augmentations

Zhao Song, Song Yue, Jiahao Zhang · 2025 · arXiv 2507.17699

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

OPT-Engine: Benchmarking the Limits of LLMs in Optimization Modeling via Complexity Scaling

cs.CL · 2026-01-09 · accept · novelty 7.0

OPT-Engine shows pure-text chain-of-thought reasoning in LLMs loses robustness as optimization complexity grows, external tools fix only local arithmetic, and solver-integrated methods are bottlenecked by automated constraint formulation.

REPOT: Recoverable Program-of-Thought via Checkpoint Repair

cs.SE · 2026-05-28 · unverdicted · novelty 6.0

RePoT recovers from PoT failures via deterministic verified replay and checkpoint repair, yielding +3 to +11pp gains on planning benchmarks and showing checkpoint state as the key recovery signal over error-only feedback.

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

cs.AI · 2025-09-02 · accept · novelty 6.0

Survey that defines agentic RL for LLMs via POMDPs, introduces a taxonomy of planning/tool-use/memory/reasoning capabilities and domains, and compiles open environments from over 500 papers.

citing papers explorer

Showing 3 of 3 citing papers.

OPT-Engine: Benchmarking the Limits of LLMs in Optimization Modeling via Complexity Scaling · cs.CL · 2026-01-09 · accept · none · ref 31
REPOT: Recoverable Program-of-Thought via Checkpoint Repair · cs.SE · 2026-05-28 · unverdicted · none · ref 19
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey · cs.AI · 2025-09-02 · accept · none · ref 117
