Lower Bounds for Chain-of-Thought Reasoning in Hard-Attention Transformers
3 Pith papers cite this work. Polarity classification is still indexing.
verdicts: UNVERDICTED 3
citation-role summary: method 1
citation-polarity summary: use method 1
citing papers explorer
- Agentic Transformers Provably Learn to Search via Reinforcement Learning
  In a stochastic k-ary tree, a two-head transformer learns randomized DFS via policy gradient under depth-wise curriculum, generalizes to deeper trees, and adapts to imbalanced goals via discounting.
- On the Spatiotemporal Dynamics of Generalization in Neural Networks
  Deriving a neural cellular automaton from locality, symmetry, and stability postulates produces 100% accurate addition generalization from 16-digit to 1-million-digit inputs.
- Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models
  The paper unifies perspectives on Long CoT in reasoning LLMs by introducing a taxonomy, detailing characteristics of deep reasoning and reflection, and discussing emergence phenomena and future directions.