hub
//arxiv.org/abs/1909.06586
10 Pith papers cite this work. Polarity classification is still indexing.
roles: background 2
citing papers explorer
- Global Convergence of Sampling-Based Nonconvex Optimization through Diffusion-Style Smoothing
  Recasts sampling-based nonconvex optimization as smoothed gradient descent to obtain non-asymptotic convergence guarantees and introduces the DIDA annealed algorithm that converges to the global optimum.
- Constraint-Enhanced Reinforcement Learning Based on Dynamic Decoupled Spherical Radial Squashing
  DD-SRad is a new RL constraint technique that adapts per-actuator radii dynamically to achieve zero violations and unconstrained-level task performance on heterogeneous robotic joints.
- Trajectory-based actuator identification via differentiable simulation
  Differentiable simulation enables torque-sensor-free actuator model identification from trajectory data, achieving 1.88x better position tracking than a stand-trained baseline and 46% longer travel in downstream locomotion policies.
- Watch Your Step: Learning Semantically-Guided Locomotion in Cluttered Environment
  SemLoco is a reinforcement learning system that integrates semantic understanding with foothold planning to let legged robots navigate cluttered environments without stepping on sensitive low-lying objects.
- A Reconfigured Wheel-Legged Robot for Enhanced Steering and Adaptability
  FLORES is a wheel-legged robot with front-leg hip-yaw DoFs replacing hip-roll, paired with a custom RL controller using adapted HIM and tailored rewards for smooth wheeled-to-legged transitions and efficient gaits.
- Iteratively Learning Muscle Memory for Legged Robots to Master Adaptive and High Precision Locomotion
  Integrates iterative learning control with a torque library to enable high-precision adaptive locomotion on bipedal and quadrupedal robots, reducing tracking errors by up to 85% and achieving over 30x faster control rates.
- TAG-K: Tail-Averaged Greedy Kaczmarz for Computationally Efficient and Performant Online Inertial Parameter Estimation
  TAG-K combines greedy randomized Kaczmarz row selection with tail averaging to deliver faster convergence and noise robustness for online inertial parameter estimation in robotics.
- Quadruped Parkour Learning: Sparsely Gated Mixture of Experts with Visual Input
  Sparsely gated MoE policies double the success rate of a real Unitree Go2 quadruped on large-obstacle parkour versus matched-active-parameter MLP baselines while cutting inference time compared with a scaled-up MLP.
- Neuromorphic Reinforcement Learning for Quadruped Locomotion Control on Uneven Terrain
- Right Model, Right Time: Real-Time Cascaded-Fidelity MPC for Bipedal Walking