RoboRouter: Training-Free Policy Routing for Robotic Manipulation

Chenjie Yang; Hongjia Ren; Huiping Zhuang; Li Zhang; Qingyao Wu; Shiyi Wang; Wenbo Li; Yanming Shao; Yemin Wang; Yiteng Chen

arxiv: 2603.07892 · v4 · pith:3IKLL4AAnew · submitted 2026-03-09 · 💻 cs.RO


Yiteng Chen, Zhe Cao, Hongjia Ren, Chenjie Yang, Wenbo Li, Shiyi Wang, Yemin Wang, Li Zhang, Yanming Shao, Zhenjun Zhao, Huiping Zhuang, Qingyao Wu


classification 💻 cs.RO

keywords policy · policies · roborouter · routing · task · robotic · across · approaches



Research on robotic manipulation has developed a diverse set of policy paradigms, including vision-language-action (VLA) models, vision-action (VA) policies, and code-based compositional approaches. Individual policies typically attain high success rates on specific task distributions but generalize poorly beyond them. Rather than proposing another monolithic policy, we propose to leverage the complementary strengths of existing approaches through intelligent policy routing. We introduce RoboRouter, a training-free framework that maintains a pool of heterogeneous policies and learns to select the best-performing policy for each task from accumulated execution experience. Given a new task, RoboRouter constructs a semantic task representation, retrieves historical records of similar tasks, predicts the optimal policy choice without trial and error, and incorporates structured feedback to refine subsequent routing decisions. Integrating a new policy into the system requires only a lightweight evaluation and incurs no training overhead. Across simulation benchmarks and real-world evaluations, RoboRouter consistently outperforms individual policies, improving the average success rate by more than 3% in simulation and 13% in real-world settings, while preserving execution efficiency. Our results demonstrate that intelligent routing across heterogeneous, off-the-shelf policies provides a practical and scalable pathway toward building more capable robotic systems.
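
The routing loop described in the abstract (represent the task, retrieve similar past records, predict a policy, record feedback) can be illustrated with a small sketch. This is not the authors' implementation: the bag-of-words embedding, the cosine similarity, the similarity-weighted success score, and the names `embed`, `cosine`, `Router`, `record`, and `route` are all assumptions made for illustration, and the abstract does not specify how RoboRouter actually represents tasks or scores policies.

```python
import math
from collections import Counter, defaultdict
from dataclasses import dataclass, field


def embed(text):
    """Toy stand-in for a semantic task representation: bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class Router:
    """Hypothetical training-free router over a fixed pool of policy names."""
    policies: list
    k: int = 5  # number of similar records to retrieve (assumed hyperparameter)
    records: list = field(default_factory=list)  # (embedding, policy, success)

    def record(self, task, policy, success):
        """Store execution feedback so later routing decisions can use it."""
        self.records.append((embed(task), policy, bool(success)))

    def route(self, task):
        """Pick the policy with the best similarity-weighted success rate."""
        query = embed(task)
        nearest = sorted(
            ((cosine(query, emb), policy, ok) for emb, policy, ok in self.records),
            key=lambda r: r[0],
            reverse=True,
        )[: self.k]
        weight = defaultdict(float)
        wins = defaultdict(float)
        for sim, policy, ok in nearest:
            weight[policy] += sim
            wins[policy] += sim * ok

        def score(policy):
            # Policies with no retrieved evidence score 0.0 in this sketch.
            return wins[policy] / weight[policy] if weight[policy] > 0 else 0.0

        return max(self.policies, key=score)


router = Router(policies=["vla", "va", "code"], k=3)
router.record("pick up the red block", "vla", True)
router.record("pick up the blue block", "vla", True)
router.record("pick up the red block", "va", False)
router.record("open the drawer", "code", True)
router.record("open the drawer", "vla", False)

print(router.route("pick up the green block"))  # vla
print(router.route("open the top drawer"))      # code
```

In this sketch, adding a policy to the pool needs no training: its name is appended to `policies` and a few evaluation outcomes are stored with `record`, which loosely mirrors the lightweight-evaluation claim in the abstract.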

This paper has not been read by Pith yet.


Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

URDF Synthesis from RGB-D Sequences via Differentiable Joint Inference and Energy-Consistent Verification
cs.CV 2026-06 unverdicted novelty 6.0

KinemaForge jointly infers part geometry, joint topology, and parameters from RGB-D sequences using a kinematic graph and differentiable dynamics, then verifies with an energy residual loss, reporting lower joint erro...

From Where Things Are to What They Are For: Benchmarking Spatial-Functional Intelligence in Multimodal LLMs
cs.CV 2026-05 unverdicted novelty 5.0

SFI-Bench shows current multimodal LLMs struggle to integrate spatial memory with functional reasoning and external knowledge in video tasks.

RouterVLA: Turning Smoke Tests into Supervision for Heterogeneous VLA Selection
cs.RO 2026-06 unverdicted novelty 4.0

RouterVLA reports that a simple probe-success rule from outcome-separated smoke tests raises held-out VLA success by 14.64pp on 34,752 LIBERO-Plus records, with learned scorers adding no further gain.