arXiv preprint arXiv:2503.18809
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers:
RunAgent improves LLM reliability on structured plans by deriving constraints on the fly, using an agentic language with control flow, and dynamically selecting reasoning modes, outperforming baselines on Natural-plan and SciBench.
citing papers explorer
- Training the Orchestrator: A Supervised Approach to End-to-End PDDL Planning with LLM Agents
HALO trains an orchestrator policy on verifier-approved refinement trajectories across 11 PDDL domains, matching GPT-5-mini success rates at roughly 45x lower orchestration cost and cutting LLM calls by 40-50%.
- RunAgent: Interpreting Natural-Language Plans with Constraint-Guided Execution
RunAgent improves LLM reliability on structured plans by deriving constraints on the fly, using an agentic language with control flow, and dynamically selecting reasoning modes, outperforming baselines on Natural-plan and SciBench.