Now You See That: Learning End-to-End Humanoid Locomotion from Raw Pixels

2026 · cs.RO · arXiv 2602.06382

6 Pith papers cite this work. Polarity classification is still indexing.

abstract

Achieving robust vision-based humanoid locomotion remains challenging due to two fundamental issues: the sim-to-real gap introduces significant perception noise that degrades performance on fine-grained tasks, and training a unified policy across diverse terrains is hindered by conflicting learning objectives. To address these challenges, we present an end-to-end framework for vision-driven humanoid locomotion. For robust sim-to-real transfer, we develop a high-fidelity depth sensor simulation that captures stereo matching artifacts and calibration uncertainties inherent in real-world sensing. We further propose a vision-aware behavior distillation approach that combines latent space alignment with noise-invariant auxiliary tasks, enabling effective knowledge transfer from privileged height maps to noisy depth observations. For versatile terrain adaptation, we introduce terrain-specific reward shaping integrated with multi-critic and multi-discriminator learning, where dedicated networks capture the distinct dynamics and motion priors of each terrain type. We validate our approach on two humanoid platforms equipped with different stereo depth cameras. The resulting policy demonstrates robust performance across diverse environments, seamlessly handling extreme challenges such as high platforms and wide gaps, as well as fine-grained tasks including bidirectional long-term staircase traversal.
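The abstract mentions a depth sensor simulation that captures stereo matching artifacts and calibration uncertainties. The following is a minimal sketch of what such a noise model could look like, not the authors' implementation. It assumes NumPy, and the function name `simulate_stereo_depth` and every parameter and default value are hypothetical. The idea is to convert depth to disparity, add matching noise and sub-pixel quantization there, then convert back with a slightly miscalibrated focal-length-times-baseline product and punch random holes.

```python
import numpy as np

def simulate_stereo_depth(depth_m, focal_px=400.0, baseline_m=0.05,
                          disparity_noise_px=0.25, subpixel_step=0.125,
                          dropout_prob=0.02, calib_scale_std=0.01,
                          max_range_m=5.0, rng=None):
    """Corrupt a clean depth image (metres) with stereo-like artifacts.

    All names and default values are illustrative assumptions.
    Invalid pixels are reported as 0.0.
    """
    rng = np.random.default_rng() if rng is None else rng
    depth_m = np.asarray(depth_m, dtype=np.float64)

    # The scene projects through the true camera geometry.
    true_fb = focal_px * baseline_m
    disparity = true_fb / np.clip(depth_m, 1e-3, None)

    # Stereo matching error: Gaussian noise in disparity space,
    # followed by quantization to the matcher's sub-pixel step.
    disparity = disparity + rng.normal(0.0, disparity_noise_px, size=depth_m.shape)
    disparity = np.round(disparity / subpixel_step) * subpixel_step

    # Calibration uncertainty: depth is recovered with a per-frame
    # scale error on the focal-length-times-baseline product.
    assumed_fb = true_fb * (1.0 + rng.normal(0.0, calib_scale_std))

    noisy = np.zeros_like(depth_m)
    valid = disparity > 0.0
    noisy[valid] = assumed_fb / disparity[valid]

    # Random holes (failed matches) and out-of-range readings become invalid.
    holes = rng.random(depth_m.shape) < dropout_prob
    noisy[holes | (noisy > max_range_m)] = 0.0
    return noisy
```

Because the noise is added in disparity space, the resulting depth error grows with distance, which is the characteristic behavior of stereo sensors; a policy trained on such images would see far surfaces as less reliable than near ones.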

representative citing papers

Perceptive Behavior Foundation Model: Adapting Human Motion Priors to Robot-Centric Terrain

cs.RO · 2026-06-06 · unverdicted · novelty 6.0

Perceptive BFM grounds human motion priors in robot terrain perception via terrain-conformal reference synthesis and teacher-student transfer from adapted to raw-reference tracking.

TAGA: Terrain-aware Active Gaze Learning for Generalizable Agile Humanoid Locomotion

cs.RO · 2026-06-04 · unverdicted · novelty 6.0

TAGA learns terrain-aware active gaze behaviors for humanoid robots via RL alone, enabling generalizable locomotion with 1.2m real-world gap traversal.

GuideWalk: Learning Unified Autonomous Navigation and Locomotion for Humanoid Robots across Versatile Terrains

cs.RO · 2026-06-09 · unverdicted · novelty 5.0

GuideWalk unifies traversability-aware navigation and terrain-adaptive locomotion into a single policy for humanoid robots via teacher distillation and RL refinement.

VAIC: Vision-Guided Humanoid Agile Object Interaction Control via Decoupled Commands

cs.RO · 2026-06-08 · unverdicted · novelty 5.0

VAIC distills a teacher policy into a vision-and-proprioception student policy using recurrent adaptation and decoupled commands, enabling diverse real-robot tasks like box carrying and skateboarding that outperform baselines.

SSR: Scaling Surefooted and Symmetric Humanoid Traversal to the Open World

cs.RO · 2026-05-29 · unverdicted · novelty 5.0

SSR is an end-to-end vision-based framework for humanoid traversal that learns imagined foothold guidance, equivariant latent-space symmetry augmentation, and terrain-specific multi-discriminator motion priors to enable safe locomotion on diverse real-world terrains.

TACT-ful: Multi-Channel Terrain Affordance and Compliance Training for Payload-Robust Perceptive Humanoid Locomotion

cs.RO · 2026-06-06 · unverdicted · novelty 4.0

A multi-channel terrain affordance reward combined with lower-body compliance training via virtual wrenches enables end-to-end PPO-trained humanoid policies to walk at 1 m/s on 0.2 m risers with improved payload robustness.

citing papers explorer

Showing 6 of 6 citing papers.

Perceptive Behavior Foundation Model: Adapting Human Motion Priors to Robot-Centric Terrain
cs.RO · 2026-06-06 · unverdicted · none · ref 26 · internal anchor

TAGA: Terrain-aware Active Gaze Learning for Generalizable Agile Humanoid Locomotion
cs.RO · 2026-06-04 · unverdicted · none · ref 52 · internal anchor

GuideWalk: Learning Unified Autonomous Navigation and Locomotion for Humanoid Robots across Versatile Terrains
cs.RO · 2026-06-09 · unverdicted · none · ref 34 · internal anchor

VAIC: Vision-Guided Humanoid Agile Object Interaction Control via Decoupled Commands
cs.RO · 2026-06-08 · unverdicted · none · ref 49 · internal anchor

SSR: Scaling Surefooted and Symmetric Humanoid Traversal to the Open World
cs.RO · 2026-05-29 · unverdicted · none · ref 5 · internal anchor

TACT-ful: Multi-Channel Terrain Affordance and Compliance Training for Payload-Robust Perceptive Humanoid Locomotion
cs.RO · 2026-06-06 · unverdicted · none · ref 17 · internal anchor
