hub Canonical reference
HumanoidBench: Simulated humanoid benchmark for whole-body locomotion and manipulation. arXiv preprint arXiv:2403.10506
Canonical reference. 71% of citing Pith papers cite this work as background.
citing papers explorer
- Labimus: A Simulation and Benchmark for Humanoid Dexterous Manipulation in Chemical Laboratory
Labimus is the first benchmark for humanoid dexterous manipulation in organic chemistry laboratories, exposing a gap between task completion and required experimental precision.
- HumanoidArena: Benchmarking Egocentric Hierarchical Whole-body Learning
HumanoidArena is a new benchmark of 7 leg-critical HOI/HSI tasks that evaluates egocentric hierarchical whole-body policies in humanoids and finds performance is strongly conditioned on the low-level GMT used.
- Real-IKEA: Physical Fidelity is the Prerequisite for Robust Manipulation
Real-IKEA supplies 1,079 physically accurate articulated asset configurations from real IKEA parts together with resistance-calibrated simulation parameters that enable RL policies to discover robust hooking and levering behaviors.
- Generative Actor-Critic with Soft Bridge Policies
SoftGAC defines a stochastic bridge from base to action latent that converts the MaxEnt objective into a tractable relative-entropy term reducible to control energy, achieving competitive returns with one-pass sampling.
- BiCoord: A Bimanual Manipulation Benchmark towards Long-Horizon Spatial-Temporal Coordination
BiCoord is a new benchmark for long-horizon tightly coordinated bimanual manipulation that includes quantitative metrics and shows existing policies like DP, RDT, Pi0 and OpenVLA-OFT struggle on such tasks.
- SPHERE: Mitigating the Loss of Spectral Plasticity in Mixture-of-Experts for Deep Reinforcement Learning
SPHERE applies a Parseval penalty to MoE policies in continual RL to maintain spectral plasticity, yielding 133% and 50% higher average success on MetaWorld and HumanoidBench versus unregularized MoE baselines.
- dWorldEval: Scalable Robotic Policy Evaluation via Discrete Diffusion World Model
A discrete diffusion model tokenizes multimodal robotic data and uses a progress token to predict future states and task completion for scalable policy evaluation.
- FlashSAC: Fast and Stable Off-Policy Reinforcement Learning for High-Dimensional Robot Control
FlashSAC improves training speed and final performance of off-policy RL on high-dimensional robot tasks by reducing update frequency, increasing model scale, and bounding norms to limit critic error accumulation.
- Use the Online Network If You Can: Towards Fast and Stable Reinforcement Learning
MINTO sets bootstrapped targets to the minimum of online and target network estimates, yielding faster stable value learning across online/offline RL and discrete/continuous actions.
- WOLF-VLA: Whole-Body Humanoid Optimal Locomotion Framework for Vision-Language-Action Learning
WOLF-VLA creates a dataset of optimal-control humanoid trajectories and trains a VLA model to generate locomotion policies from natural language instructions, with planned open release of data and tools.
- DeepInsight: A Unified Evaluation Infrastructure Across the Physical AI Stack
DeepInsight introduces a unified evaluation infrastructure for the full Physical AI stack using three invariant abstractions to enable cross-layer diagnostics on one runtime.
- Cooperative Long Rope Skipping via Multi-Agent Reinforcement Learning
Marope applies hierarchical MARL with decentralized lower-level rope policies and a centralized scheduler to achieve cooperative long rope skipping on Unitree G1 humanoids in simulation and reality.
- When Does Non-Uniform Replay Matter in Reinforcement Learning?
Non-uniform replay helps most when replay volume is low; high-entropy sampling remains important, and a truncated geometric distribution delivers better sample efficiency with negligible overhead.
- ALAS: Adaptive Long-Horizon Action Synthesis via Async-pathway Stream Disentanglement
ALAS disentangles environment and self-state streams via bio-inspired modules to deliver 23% higher subtask success and 29% better execution efficiency on long-horizon HSI tasks.
- Prismatic World Model: Learning Compositional Dynamics for Planning in Hybrid Systems
PRISM-WM uses a context-aware MoE with latent orthogonalization to model hybrid dynamics and reduce rollout drift for model-based planning.
- World Action Models: The Next Frontier in Embodied AI
The paper introduces World Action Models as a new paradigm unifying predictive world modeling with action generation in embodied foundation models and provides a taxonomy of existing approaches.
- A Survey of Legged Robotics in Non-Inertial Environments: Past, Present, and Future
A literature survey summarizing modeling, state estimation, control methods, applications, and open challenges for legged robots operating in non-inertial environments where the ground moves or accelerates.
- Learn Weightlessness: Imitate Non-Self-Stabilizing Motions on Humanoid Robot