Learning Whole-Body Humanoid Locomotion via Motion Generation and Motion Tracking

2026 · cs.RO · arXiv 2604.17335

5 Pith papers cite this work. Polarity classification is still indexing.

abstract

Whole-body humanoid locomotion is challenging due to high-dimensional control, morphological instability, and the need for real-time adaptation to various terrains using onboard perception. Directly applying reinforcement learning (RL) with reward shaping to humanoid locomotion often leads to lower-body-dominated behaviors, whereas imitation-based RL can learn more coordinated whole-body skills but is typically limited to replaying reference motions without a mechanism to adapt them online from perception for terrain-aware locomotion. To address this gap, we propose a whole-body humanoid locomotion framework that combines skills learned from reference motions with terrain-aware adaptation. We first train a diffusion model on retargeted human motions for real-time prediction of terrain-aware reference motions. Concurrently, we train a whole-body reference tracker with RL using this motion data. To improve robustness under imperfectly generated references, we further fine-tune the tracker with a frozen motion generator in a closed-loop setting. The resulting system supports directional goal-reaching control with terrain-aware whole-body adaptation, and can be deployed on a Unitree G1 humanoid robot with onboard perception and computation. The hardware experiments demonstrate successful traversal over boxes, hurdles, stairs, and mixed terrain combinations. Quantitative results further show the benefits of incorporating online motion generation and fine-tuning the motion tracker for improved generalization and robustness.

representative citing papers

VLK: Learning Humanoid Loco-Manipulation from Synthetic Interactions in Reconstructed Scenes

cs.RO · 2026-06-29 · unverdicted · novelty 6.0

Generates 48,000 synthetic VLK trajectories in 3D-reconstructed scenes to train a policy for egocentric perception-based humanoid navigation and object transport, demonstrated on a physical Unitree G1 robot.

SceneBot: Contact-Prompted General Humanoid Whole Body Tracking with Scene-Interaction

cs.RO · 2026-06-25 · unverdicted · novelty 6.0

SceneBot conditions a humanoid tracking policy on motion references and contact labels, using reconstructed scene-interaction data to unify free-space locomotion with contact-rich manipulation and terrain tasks.

Perceptive Behavior Foundation Model: Adapting Human Motion Priors to Robot-Centric Terrain

cs.RO · 2026-06-06 · unverdicted · novelty 6.0

Perceptive BFM grounds human motion priors in robot terrain perception via terrain-conformal reference synthesis and teacher-student transfer from adapted to raw-reference tracking.

T-GMP: Terrain-conditioned Generative Motion Priors for Versatile and Natural Humanoid Locomotion

cs.RO · 2026-06-05 · unverdicted · novelty 5.0

T-GMP learns a terrain-conditioned latent motion manifold via CVAE from demonstrations and integrates it into an adversarial pipeline with a foothold penalty for versatile, natural humanoid locomotion.

LadderMan: Learning Humanoid Perceptive Ladder Climbing

cs.RO · 2026-06-04 · unverdicted · novelty 5.0

A hybrid motion-tracking and imitation-reinforcement pipeline produces a depth-based visuomotor policy that lets humanoids climb varied ladders zero-shot on hardware and perform teleoperated manipulation while climbing.
