
HumanoidBench: Simulated Humanoid Benchmark for Whole-Body Locomotion and Manipulation. arXiv preprint arXiv:2403.10506

Carmelo Sferrazza, Dun-Ming Huang, Xingyu Lin, Youngwoon Lee, Pieter Abbeel · 2024 · arXiv 2403.10506

Canonical reference. 18 Pith papers cite this work, and 71% of their classified citations cite it as background.



citation-role summary

background: 6 · dataset: 1

citation-polarity summary

background: 5 · unclear: 1 · use dataset: 1

representative citing papers

Labimus: A Simulation and Benchmark for Humanoid Dexterous Manipulation in Chemical Laboratory

cs.RO · 2026-06-30 · unverdicted · novelty 7.0

Labimus is the first benchmark for humanoid dexterous manipulation in organic chemistry laboratories, exposing a gap between task completion and required experimental precision.

HumanoidArena: Benchmarking Egocentric Hierarchical Whole-body Learning

cs.RO · 2026-06-16 · unverdicted · novelty 7.0

HumanoidArena is a new benchmark of 7 leg-critical HOI/HSI tasks that evaluates egocentric hierarchical whole-body policies in humanoids and finds performance is strongly conditioned on the low-level GMT used.

Real-IKEA: Physical Fidelity is the Prerequisite for Robust Manipulation

cs.RO · 2026-06-07 · unverdicted · novelty 7.0

Real-IKEA supplies 1,079 physically accurate articulated asset configurations from real IKEA parts together with resistance-calibrated simulation parameters that enable RL policies to discover robust hooking and levering behaviors.

Generative Actor-Critic with Soft Bridge Policies

cs.LG · 2026-05-09 · unverdicted · novelty 7.0

SoftGAC defines a stochastic bridge from base to action latent that converts the MaxEnt objective into a tractable relative-entropy term reducible to control energy, achieving competitive returns with one-pass sampling.

BiCoord: A Bimanual Manipulation Benchmark towards Long-Horizon Spatial-Temporal Coordination

cs.RO · 2026-04-07 · conditional · novelty 7.0

BiCoord is a new benchmark for long-horizon tightly coordinated bimanual manipulation that includes quantitative metrics and shows existing policies like DP, RDT, Pi0 and OpenVLA-OFT struggle on such tasks.

SPHERE: Mitigating the Loss of Spectral Plasticity in Mixture-of-Experts for Deep Reinforcement Learning

cs.LG · 2026-05-06 · unverdicted · novelty 6.0 · 2 refs

SPHERE applies a Parseval penalty to MoE policies in continual RL to maintain spectral plasticity, yielding 133% and 50% higher average success on MetaWorld and HumanoidBench versus unregularized MoE baselines.

dWorldEval: Scalable Robotic Policy Evaluation via Discrete Diffusion World Model

cs.RO · 2026-04-24 · unverdicted · novelty 6.0

A discrete diffusion model tokenizes multimodal robotic data and uses a progress token to predict future states and task completion for scalable policy evaluation.

FlashSAC: Fast and Stable Off-Policy Reinforcement Learning for High-Dimensional Robot Control

cs.LG · 2026-04-06 · unverdicted · novelty 6.0 · 2 refs

FlashSAC improves training speed and final performance of off-policy RL on high-dimensional robot tasks by reducing update frequency, increasing model scale, and bounding norms to limit critic error accumulation.

Use the Online Network If You Can: Towards Fast and Stable Reinforcement Learning

cs.LG · 2025-10-02 · unverdicted · novelty 6.0

MINTO sets bootstrapped targets to the minimum of online and target network estimates, yielding faster stable value learning across online/offline RL and discrete/continuous actions.

WOLF-VLA: Whole-Body Humanoid Optimal Locomotion Framework for Vision-Language-Action Learning

cs.RO · 2026-06-24 · unverdicted · novelty 5.0

WOLF-VLA creates a dataset of optimal-control humanoid trajectories and trains a VLA model to generate locomotion policies from natural language instructions, with planned open release of data and tools.

DeepInsight: A Unified Evaluation Infrastructure Across the Physical AI Stack

cs.AI · 2026-06-16 · unverdicted · novelty 5.0

DeepInsight introduces a unified evaluation infrastructure for the full Physical AI stack using three invariant abstractions to enable cross-layer diagnostics on one runtime.

Cooperative Long Rope Skipping via Multi-Agent Reinforcement Learning

cs.RO · 2026-06-06 · unverdicted · novelty 5.0

Marope applies hierarchical MARL with decentralized lower-level rope policies and a centralized scheduler to achieve cooperative long rope skipping on Unitree G1 humanoids in simulation and reality.

When Does Non-Uniform Replay Matter in Reinforcement Learning?

cs.LG · 2026-05-11 · unverdicted · novelty 5.0 · 3 refs

Non-uniform replay helps most when replay volume is low; high-entropy sampling remains important, and a truncated geometric distribution delivers better sample efficiency with negligible overhead.

ALAS: Adaptive Long-Horizon Action Synthesis via Async-pathway Stream Disentanglement

cs.RO · 2026-04-22 · unverdicted · novelty 5.0

ALAS disentangles environment and self-state streams via bio-inspired modules to deliver 23% higher subtask success and 29% better execution efficiency on long-horizon HSI tasks.

Prismatic World Model: Learning Compositional Dynamics for Planning in Hybrid Systems

cs.AI · 2025-12-09 · unverdicted · novelty 5.0

PRISM-WM uses a context-aware MoE with latent orthogonalization to model hybrid dynamics and reduce rollout drift for model-based planning.

World Action Models: The Next Frontier in Embodied AI

cs.RO · 2026-05-12 · unverdicted · novelty 4.0

The paper introduces World Action Models as a new paradigm unifying predictive world modeling with action generation in embodied foundation models and provides a taxonomy of existing approaches.

A Survey of Legged Robotics in Non-Inertial Environments: Past, Present, and Future

cs.RO · 2026-04-22 · unverdicted · novelty 2.0

A literature survey summarizing modeling, state estimation, control methods, applications, and open challenges for legged robots operating in non-inertial environments where the ground moves or accelerates.

Learn Weightlessness: Imitate Non-Self-Stabilizing Motions on Humanoid Robot

cs.RO · 2026-04-23

citing papers explorer

Showing 18 of 18 citing papers.

Labimus: A Simulation and Benchmark for Humanoid Dexterous Manipulation in Chemical Laboratory · cs.RO · 2026-06-30 · unverdicted · none · ref 36
HumanoidArena: Benchmarking Egocentric Hierarchical Whole-body Learning · cs.RO · 2026-06-16 · unverdicted · none · ref 13
Real-IKEA: Physical Fidelity is the Prerequisite for Robust Manipulation · cs.RO · 2026-06-07 · unverdicted · none · ref 8
Generative Actor-Critic with Soft Bridge Policies · cs.LG · 2026-05-09 · unverdicted · none · ref 30
BiCoord: A Bimanual Manipulation Benchmark towards Long-Horizon Spatial-Temporal Coordination · cs.RO · 2026-04-07 · conditional · none · ref 50
SPHERE: Mitigating the Loss of Spectral Plasticity in Mixture-of-Experts for Deep Reinforcement Learning · cs.LG · 2026-05-06 · unverdicted · none · ref 96 · 2 links
dWorldEval: Scalable Robotic Policy Evaluation via Discrete Diffusion World Model · cs.RO · 2026-04-24 · unverdicted · none · ref 36
FlashSAC: Fast and Stable Off-Policy Reinforcement Learning for High-Dimensional Robot Control · cs.LG · 2026-04-06 · unverdicted · none · ref 76 · 2 links
Use the Online Network If You Can: Towards Fast and Stable Reinforcement Learning · cs.LG · 2025-10-02 · unverdicted · none · ref 7
WOLF-VLA: Whole-Body Humanoid Optimal Locomotion Framework for Vision-Language-Action Learning · cs.RO · 2026-06-24 · unverdicted · none · ref 26
DeepInsight: A Unified Evaluation Infrastructure Across the Physical AI Stack · cs.AI · 2026-06-16 · unverdicted · none · ref 16
Cooperative Long Rope Skipping via Multi-Agent Reinforcement Learning · cs.RO · 2026-06-06 · unverdicted · none · ref 43
When Does Non-Uniform Replay Matter in Reinforcement Learning? · cs.LG · 2026-05-11 · unverdicted · none · ref 33 · 3 links
ALAS: Adaptive Long-Horizon Action Synthesis via Async-pathway Stream Disentanglement · cs.RO · 2026-04-22 · unverdicted · none · ref 27
Prismatic World Model: Learning Compositional Dynamics for Planning in Hybrid Systems · cs.AI · 2025-12-09 · unverdicted · none · ref 4
World Action Models: The Next Frontier in Embodied AI · cs.RO · 2026-05-12 · unverdicted · none · ref 247
A Survey of Legged Robotics in Non-Inertial Environments: Past, Present, and Future · cs.RO · 2026-04-22 · unverdicted · none · ref 31
Learn Weightlessness: Imitate Non-Self-Stabilizing Motions on Humanoid Robot · cs.RO · 2026-04-23 · unreviewed · ref 33
