MPC-Injection biases off-policy RL locomotion policies toward controller-induced behavior basins by injecting MPC transitions into the replay buffer.
hub
Learning quadrupedal locomotion over challenging terrain
12 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 3
polarities
background 3
representative citing papers
Instrumented objects boost diffusion policy success in robotic hanger insertion by 14-25 percentage points over vision-only baselines, and augmenting datasets with instrumented expert rollouts lets a vision-only student match the instrumented expert.
ARC-RL is a new suite of four MuJoCo continuous-control environments featuring game-inspired hexapod and quadruped morphologies, a single closed-form multi-component reward function, CPG demonstrators, and empirical comparisons of online and offline-to-online RL algorithms.
FlashSAC improves training speed and final performance of off-policy RL on high-dimensional robot tasks by reducing update frequency, increasing model scale, and bounding norms to limit critic error accumulation.
Integrates iterative learning control with a torque library to enable high-precision adaptive locomotion on bipedal and quadrupedal robots, reducing tracking errors by up to 85% and achieving over 30x faster control rates.
T-GMP learns a terrain-conditioned latent motion manifold via CVAE from demonstrations and integrates it into an adversarial pipeline with a foothold penalty for versatile, natural humanoid locomotion.
Integrating foot position maps into heightmaps and adding a locomotion-stability reward in an attention-based RL framework improves quadrupedal success rates on both trained and out-of-domain complex terrains.
KYON is a semi-modular wheel-legged quadruped with reconfigurable lower legs, base-mounted actuators, and bimanual manipulation, using whole-body control plus an RL policy for dynamic locomotion and tasks in unstructured environments.
Empirical comparison of blind, critic-perceptive, and fully perceptive variants of morphology-aware RL locomotion controllers shows critic-only perception improves robustness over blind baselines while remaining more stable under perception noise than full perception.
A DRL locomotion controller extended from prior quadruped work enabled the Go2-W robot to complete 2.8 km of autonomous real-world navigation including mixed terrain and stairs.
SDPG is a new on-policy visual RL algorithm that estimates gradients via stochastic perturbations of rollouts, achieving faster training and lower memory use than baselines on visual MuJoCo tasks while adding new robotics benchmarks and sim-to-real results.
citing papers explorer
- FlashSAC: Fast and Stable Off-Policy Reinforcement Learning for High-Dimensional Robot Control
FlashSAC improves training speed and final performance of off-policy RL on high-dimensional robot tasks by reducing update frequency, increasing model scale, and bounding norms to limit critic error accumulation.
- Learning Locomotion on Complex Terrain for Quadrupedal Robots with Foot Position Maps and Stability Rewards
Integrating foot position maps into heightmaps and adding a locomotion-stability reward in an attention-based RL framework improves quadrupedal success rates on both trained and out-of-domain complex terrains.
- Neuromorphic Reinforcement Learning for Quadruped Locomotion Control on Uneven Terrain