Toward Reliable Sim-to-Real Predictability for MoE-based Robust Quadrupedal Locomotion

2026 · cs.RO · arXiv 2602.00678

1 Pith paper cites this work.

abstract

Reinforcement learning has shown strong promise for agile quadrupedal locomotion, even with proprioception-only sensing. In practice, however, the sim-to-real gap and reward overfitting on complex terrains can produce policies that fail to transfer, while physical validation remains risky and inefficient. To address these challenges, we introduce a unified framework that combines a Mixture-of-Experts (MoE) locomotion policy for robust multi-terrain representation with RoboGauge, a predictive assessment suite that quantifies sim-to-real transferability. The MoE policy employs a gated set of specialist experts to decompose latent terrain and command modeling, achieving superior deployment robustness and generalization via proprioception alone. RoboGauge further provides multi-dimensional proprioception-based metrics via sim-to-sim tests across terrains, difficulty levels, and domain randomizations, enabling reliable MoE policy selection without extensive physical trials. Experiments on a Unitree Go2 demonstrate robust locomotion on unseen challenging terrains, including snow, sand, stairs, slopes, and 30 cm obstacles. In dedicated high-speed tests, the robot reaches 4 m/s and exhibits an emergent narrow-width gait associated with improved stability at high velocity.
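
The "gated set of specialist experts" mentioned in the abstract can be illustrated with a minimal sketch of a softmax-gated mixture. This is not the paper's architecture, which the abstract does not specify: the class name GatedMoEPolicy, the linear experts, the dense softmax gate, and the dimensions (45 proprioceptive inputs, 12 joint targets, 4 experts) are all assumptions chosen only to show how a gate can blend expert outputs.

```python
import numpy as np


def softmax(x):
    # Numerically stable softmax over the last axis.
    z = x - np.max(x, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)


class GatedMoEPolicy:
    """Hypothetical softmax-gated mixture of linear experts (illustrative only)."""

    def __init__(self, obs_dim, act_dim, num_experts, seed=0):
        rng = np.random.default_rng(seed)
        # Gate maps an observation to one logit per expert.
        self.gate_w = rng.normal(scale=0.1, size=(obs_dim, num_experts))
        # Each expert is a single linear map here; real experts would be MLPs.
        self.expert_w = rng.normal(scale=0.1, size=(num_experts, obs_dim, act_dim))

    def gate(self, obs):
        # Mixing weights over experts; non-negative and summing to 1.
        return softmax(obs @ self.gate_w)

    def act(self, obs):
        weights = self.gate(obs)                                     # (num_experts,)
        expert_actions = np.einsum("o,eoa->ea", obs, self.expert_w)  # (num_experts, act_dim)
        # Action is the gate-weighted sum of the expert outputs.
        return weights @ expert_actions                              # (act_dim,)


# Assumed sizes: 45-dim proprioceptive observation, 12 joint targets, 4 experts.
policy = GatedMoEPolicy(obs_dim=45, act_dim=12, num_experts=4)
observation = np.ones(45)
action = policy.act(observation)
```

A dense gate like this one evaluates every expert at each step; sparse top-k routing is a common alternative, and the abstract does not say which variant the authors use.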

representative citing papers

Learning to Balance Motor Thermal Safety and Quadrupedal Locomotion Performance with Residual Policy

cs.RO · 2026-05-26 · unverdicted · novelty 4.0 · 2 refs

A two-stage RL framework with a thermal-aware residual policy enables a Unitree A1 quadruped to sustain over 13 minutes of stable locomotion under a 3 kg payload, versus 5 minutes before overheating with the nominal policy alone.
