2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- Deep Reasoning in General Purpose Agents via Structured Meta-Cognition
  DOLORES, an agent using a formal language for meta-reasoning to construct adaptive scaffolds on the fly, outperforms prior scaffolding methods by 24.8% on average across four hard benchmarks and multiple model sizes.
- Bandit Learning in General Open Multi-agent Systems
  A unified bandit framework for general open multi-agent systems with global-UCB algorithms and regret bounds linear in entry uncertainty and dependent on system stability and agent patterns.