A walk in the park: Learning to walk in 20 minutes with model-free reinforcement learning
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers:

- Neuromorphic Reinforcement Learning for Quadruped Locomotion Control on Uneven Terrain: An equilibrium-propagation-based PPO controller for a 12-DoF quadruped achieves locomotion performance comparable to backpropagation-trained PPO on uneven terrain while using 4.3 times less GPU memory.
- When Does Non-Uniform Replay Matter in Reinforcement Learning? Non-uniform replay improves RL sample efficiency mainly in low replay-volume regimes, with high-entropy sampling being key even at comparable recency.