A reinforcement-learned Conductor coordinates pools of LLMs to reach state-of-the-art results on LiveCodeBench and GPQA by discovering communication topologies and targeted prompts.
Understand the problem and the transportation options available. Determine the strategy to find the minimum time to travel from city 1 to city N.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Learning to Orchestrate Agents in Natural Language with the Conductor
A reinforcement-learned Conductor coordinates pools of LLMs to reach state-of-the-art results on LiveCodeBench and GPQA by discovering communication topologies and targeted prompts.