Frontier LLMs still struggle with simple reasoning tasks
5 Pith papers cite this work.
Citing papers by year: 2026 (5)
Citing papers
- Certification from Examples is Hard for Circuits and Transformers under Minimal Overparametrization
  Minimal overparametrization makes exact certification from examples exponentially hard for depth-2 threshold circuits and log-precision Transformers.
- Beyond Accuracy: Diagnosing Algebraic Reasoning Failures in LLMs Across Nine Complexity Dimensions
  A nine-dimension algebraic complexity framework shows that LLMs suffer a scale-invariant working memory bottleneck, collapsing at 20-30 parallel branches regardless of parameter count from 8B to 235B.
- Deep Reasoning in General Purpose Agents via Structured Meta-Cognition
  DOLORES, an agent using a formal language for meta-reasoning to construct adaptive scaffolds on the fly, outperforms prior scaffolding methods by 24.8% on average across four hard benchmarks and multiple model sizes.
- To See the Unseen: on the Generalization Ability of Transformers in Symbolic Reasoning
  Unembedding collapse in transformers prevents distinguishing unseen tokens in symbolic reasoning, but targeted interventions restore generalization.
- Beyond representational alignment with brain-guided language models for robust reasoning
  Task-evoked brain signals enhance LLM reasoning performance via representation steering at inference and fine-tuning, yielding up to 13 percent accuracy gains orthogonal to language supervision.