Sampling Strategies for Robust Universal Quadrupedal Locomotion Policies
read the original abstract
This work focuses on sampling strategies of configuration variations for generating robust universal locomotion policies for quadrupedal robots. We investigate the effects of sampling physical robot parameters and joint proportional-derivative gains to enable training a single reinforcement learning policy that generalizes to multiple parameter configurations. Three fundamental joint gain sampling strategies are compared: parameter sampling with (1) linear and polynomial function mappings of mass-to-gains, (2) performance-based adaptive filtering, and (3) uniform random sampling. We improve the robustness of the policy by biasing the configurations using nominal priors and reference models. All training was conducted using the RaiSim simulation environment, tested in simulation on a range of diverse quadrupeds, and zero-shot deployed onto hardware using the ANYmal quadruped robot. Compared to multiple baseline implementations, our results demonstrate the need for significant joint controller gains randomization for robust closing of the sim-to-real gap.
This paper has not been read by Pith yet.
Forward citations
Cited by 1 Pith paper
-
Learning Perceptive Platform Adaptive Locomotion Controllers for Quadrupedal Robots
Empirical comparison of blind, critic-perceptive, and fully perceptive variants of morphology-aware RL locomotion controllers shows critic-only perception improves robustness over blind baselines while remaining more ...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.