DOLORES, an agent using a formal language for meta-reasoning to construct adaptive scaffolds on the fly, outperforms prior scaffolding methods by 24.8% on average across four hard benchmarks and multiple model sizes.
Cognition , volume=
3 Pith papers cite this work. Polarity classification is still indexing.
years
2026 3verdicts
UNVERDICTED 3representative citing papers
Large language models encode relational bindings via a cell-based representation: a low-dimensional linear subspace in which each cell corresponds to an entity-relation index pair and attributes are retrieved from the matching cell.
Outcome-level RL with binary or composite rewards improves compositional generalization over supervised fine-tuning by avoiding overfitting to frequent training patterns.
citing papers explorer
-
Deep Reasoning in General Purpose Agents via Structured Meta-Cognition
DOLORES, an agent using a formal language for meta-reasoning to construct adaptive scaffolds on the fly, outperforms prior scaffolding methods by 24.8% on average across four hard benchmarks and multiple model sizes.
-
Cell-Based Representation of Relational Binding in Language Models
Large language models encode relational bindings via a cell-based representation: a low-dimensional linear subspace in which each cell corresponds to an entity-relation index pair and attributes are retrieved from the matching cell.
-
Reinforcement Learning for Compositional Generalization with Outcome-Level Optimization
Outcome-level RL with binary or composite rewards improves compositional generalization over supervised fine-tuning by avoiding overfitting to frequent training patterns.