hub Mixed citations
MuJoCo: A physics engine for model-based control, in: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, IEEE
Mixed citation behavior. Most common role is background (67%).
citing papers explorer
-
Unleashing Infinite Motion: Scaling Expressive Quadrupedal Motion via Generative Video Priors
Uni-Mo generates 7,488 language-annotated quadruped motions via LLM prompts and video diffusion, lifts them to 3D trajectories, and trains policies achieving 96.7% real-robot success on 392 sampled motions.
-
MPC-Injection: Biasing Off-Policy Locomotion RL Toward Controller-Induced Behavior Basins
MPC-Injection biases off-policy RL locomotion policies toward controller-induced behavior basins by injecting MPC transitions into the replay buffer.
-
HARBOR: A Harness Framework for Agentic Robot Reinforcement Learning
HARBOR is a new agentic harness framework that automates robot RL workflows end-to-end across 16 tasks in manipulation, locomotion, and dexterous control, matching or exceeding default configurations while enabling sim-to-real transfer.
-
SceneCode: Executable World Programs for Editable Indoor Scenes with Articulated Objects
SceneCode compiles natural language prompts into executable code programs that generate editable, articulated indoor scenes for physics simulation.
-
Randomized Advantage Transformation (RAT): Computing Natural Policy Gradients via Direct Backpropagation
RAT reformulates regularized natural policy gradients as vanilla gradients with a transformed advantage, computed efficiently via randomized block Kaczmarz iterations on on-policy data.
-
CelloCut: Constructive Watertight Remeshing via Tetrahedral Cell Cuts
CelloCut formulates watertight remeshing as binary labeling on a Delaunay tetrahedral partition solved by graph-cut minimization with one-sided constraints to guarantee volumetrically consistent solids.
-
EgoFun3D: Modeling Interactive Objects from Egocentric Videos using Function Templates
EgoFun3D creates a new task, 271-video dataset, and pipeline using function templates to model interactive 3D objects from egocentric videos for simulation.
-
HeiSD: Hybrid Speculative Decoding for Embodied Vision-Language-Action Models with Kinematic Awareness
HeiSD delivers up to 2.45x faster inference for embodied VLA models by hybridizing speculative decoding with kinematic boundary detection and error-mitigation tricks while preserving task success rates.
-
Continuum Robot Localization using Distributed Time-of-Flight Sensors
Distributed low-resolution time-of-flight sensors along a 53 cm continuum robot, fused with a shape prior, achieve 2.5 cm position and 7.2 degree orientation localization error in simulation and real experiments across multiple environments.
-
GLUE: Coordinating Pre-Trained Generative Models for System-Level Design
GLUE orchestrates frozen pre-trained generative models into a system-level design generator that enforces feasibility, performance, and diversity, with data-driven and data-free variants benchmarked on UAV design.
-
Frictional Q-Learning
Frictional Q-Learning encodes supported actions as tangent directions on an action manifold using a contrastive variational autoencoder to reduce extrapolation errors in off-policy reinforcement learning.
-
BeyondMimic: From Motion Tracking to Versatile Humanoid Control via Guided Diffusion
BeyondMimic combines compact motion tracking with a unified guided latent diffusion model to master diverse agile behaviors from human demos and solve unseen downstream tasks via test-time classifier guidance.
-
LLMPhy: Parameter-Identifiable Physical Reasoning Combining Large Language Models and Physics Engines
LLMPhy uses iterative LLM-generated programs executed in physics engines to solve continuous parameter estimation and discrete scene layout problems, outperforming prior black-box methods on three new zero-shot physical reasoning datasets.
-
Scalable Behavior Cloning with Open Data, Training, and Evaluation
Releases the largest open teleoperation dataset for robot manipulation together with hardware, simulation, and training infrastructure to support scalable behavior cloning.
-
Hallucination in World Models is Predictable and Preventable
Hallucination in world models is a data coverage issue predictable by three signals and preventable through targeted training sampling and online data collection.
-
OmniContact: Chaining Meta-Skills via Contact Flow for Generalizable Humanoid Loco-Manipulation
OmniContact introduces contact flow as a shared representation of body trajectories and contact signals to learn and chain loco-manipulation meta-skills, reporting 98.7% success on box carrying and 76.5% on push-stack tasks.
-
AutoDex: An Automated Real-World System for Dexterous Grasping Data Collection
AutoDex automates the full perception-execution-labeling-reset loop for real-world dexterous grasping data collection, delivering 4.8x throughput over teleoperation and 76% success for retrieved grasps versus 34% from simulation-only data.
-
NASDAQ: Normalized Observation Space Dynamics-Augmented Q-Learning
NASDAQ normalizes observations in an online RL setting so that dynamics prediction losses are balanced across dimensions, yielding competitive performance with lower wall-time than prior model-based and self-predictive methods.
-
Inductive Generalization for Robotic Manipulation
The paper introduces an inductive generalization evaluation protocol for manipulation policies and shows that SOTA vision-language-action models fail on progressively harder task variants.
-
Do as I Do: Dexterous Manipulation Data from Everyday Human Videos
DO AS I DO reconstructs and retargets hand-object interactions from in-the-wild monocular RGB videos to produce dexterous robot manipulation trajectories, outperforming prior methods on ground-truth and online video datasets.
-
AnnotateAnything: Automatic Annotation of 3D Assets for Robot Manipulation
AnnotateAnything converts passive 3D assets into manipulation-ready assets by combining vision-language reasoning for semantics with parallel physics pipelines for executable action annotations such as grasps and articulations.
-
Shield-Loco: Shielding Locomotion Policies with Predictive Safety Filtering
A post-hoc predictive safety filter adjusts RL policy contact locations for quadruped robots via sampling-based optimization on a full-physics model, reducing safety violations in cluttered environments with minimal performance deviation.
-
AEGIS: A Backup Reflex for Physical AI
AEGIS uses activation probes for early-warning detection of high-risk steps in weak policies and selectively escalates to stronger policies, recovering 10.1% of lost trajectories on LIBERO-Spatial while activating the strong policy on only 38% of steps.
-
Shape Your Body: Value Gradients for Multi-Embodiment Robot Design
Trains embodiment-aware value functions on up to 50 robots and applies their gradients as differentiable surrogates to optimize held-out robot designs with over 1100 parameters.
-
Curriculum reinforcement learning with measurable task representation learning
A VAE-based latent task representation enables automatic curriculum generation in CRL for non-Euclidean navigation tasks, outperforming interpolation and GAN-based methods in experiments.
-
ARC-RL: A Reinforcement Learning Playground Inspired by ARC Raiders
ARC-RL is a new suite of four MuJoCo continuous-control environments featuring game-inspired hexapod and quadruped morphologies, a single closed-form multi-component reward function, CPG demonstrators, and empirical comparisons of online and offline-to-online RL algorithms.
-
Global Convergence of Sampling-Based Nonconvex Optimization through Diffusion-Style Smoothing
Recasts sampling-based nonconvex optimization as smoothed gradient descent to obtain non-asymptotic convergence guarantees and introduces the DIDA annealed algorithm that converges to the global optimum.
-
PhyMotion: Structured 3D Motion Reward for Physics-Grounded Human Video Generation
PhyMotion scores generated human videos by grounding recovered 3D poses in a physics simulator across kinematic, contact, and dynamic axes, yielding stronger human correlation and larger RL post-training gains than prior 2D rewards.
-
R2R2: Robust Representation for Intensive Experience Reuse via Redundancy Reduction in Self-Predictive Learning
R2R2 introduces a non-centered regularization objective for SPL that addresses conflicts with spectral properties, leading to better performance on continuous control tasks at high UTD ratios.
-
Mirror, Mirror on the Wall: Can VLM Agents Tell Who They Are at All?
Stronger VLM agents use mirror reflections for self-identification in controlled 3D tests, while weaker ones inspect but fail to extract or correctly attribute self-relevant information.
-
Lucid-XR: An Extended-Reality Data Engine for Robotic Manipulation
Lucid-XR uses XR-headset physics simulation and physics-guided video generation to create synthetic data that trains robot policies transferring zero-shot to unseen real-world manipulation tasks.
-
VADF: Vision-Adaptive Diffusion Policy Framework for Efficient Robotic Manipulation
VADF adds an Adaptive Loss Network for hard-negative training sampling and a Hierarchical Vision Task Segmenter for adaptive noise scheduling during inference to speed convergence and reduce timeouts in diffusion robotic policies.
-
Solving Physics Olympiad via Reinforcement Learning on Physics Simulators
Physics simulators generate synthetic QA data for RL training that improves LLM performance on IPhO problems by 5-10 percentage points.
-
FlashSAC: Fast and Stable Off-Policy Reinforcement Learning for High-Dimensional Robot Control
FlashSAC improves training speed and final performance of off-policy RL on high-dimensional robot tasks by reducing update frequency, increasing model scale, and bounding norms to limit critic error accumulation.
-
frax: Fast Robot Kinematics and Dynamics in JAX
frax is a new open-source JAX library delivering low-microsecond CPU dynamics and over 100 million GPU evaluations per second for robot kinematics and dynamics with autodiff support.
-
HUSKY: Humanoid Skateboarding System via Physics-Aware Whole-Body Control
HUSKY combines humanoid-skateboard dynamics modeling with adversarial motion priors and physics-guided lean-to-steer strategies to achieve real-world stable skateboarding on a humanoid robot.
-
Toward Reliable Sim-to-Real Predictability for MoE-based Robust Quadrupedal Locomotion
An MoE-based locomotion policy with RoboGauge metrics achieves reliable sim-to-real transfer, enabling robust quadrupedal walking on challenging unseen terrains at speeds up to 4 m/s.
-
Neural CDEs as Correctors for Learned Time Series Models
Neural CDEs serve as correctors that reduce error accumulation in multi-step forecasts from learned time-series models across synthetic, physics, and real-world data.
-
SONIC: Supersizing Motion Tracking for Natural Humanoid Whole-Body Control
Scaling motion tracking models along size, data volume, and compute produces a foundation model for natural, robust humanoid whole-body control with downstream uses in kinematic planning and vision-language-action models.
-
Isaac Lab: A GPU-Accelerated Simulation Framework for Multi-Modal Robot Learning
Isaac Lab is a unified GPU-native platform combining high-fidelity physics, photorealistic rendering, multi-frequency sensors, domain randomization, and learning pipelines for scalable multi-modal robot policy training.
-
GraspVLA: a Grasping Foundation Model Pre-trained on Billion-scale Synthetic Action Data
GraspVLA shows that pretraining a grasping model on a billion synthetic action frames enables zero-shot open-vocabulary performance and sim-to-real transfer.
-
AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
AWAC combines offline data with online RL via advantage-weighted actor-critic updates to enable faster acquisition of robotic skills such as dexterous manipulation.
-
Evolvability ES: Scalable and Direct Optimization of Evolvability
Evolvability ES is an evolutionary strategy variant that directly optimizes for evolvability by maximizing behavioral diversity under mutations, tested on 2D/3D locomotion tasks and shown competitive with MAML.
-
Any-Body Guard: Universal Safeguarding for Manipulation Policies via Action Masking
X-Safe masks actions in configuration space using forward kinematics and quasi-static object models to give probabilistic collision-avoidance guarantees that transfer across robot embodiments without per-setup engineering.
-
Objective-Behavior Alignment: Diagnostics for MORL Policy Selection
Proposes an exploratory diagnostic workflow to highlight behavioral variation along MORL Pareto fronts not captured by objective values, with validation on grid and continuous control tasks.
-
AI Sandboxes: A Threat Model, Taxonomy, and Measurement Framework
The paper presents a threat model, taxonomy, and six-dimension measurement framework for AI sandboxes to clarify valid testing claims for safety, security, and regulatory assurance.
-
OMG: Omni-Modal Motion Generation for Generalist Humanoid Control
OMG is a diffusion model for omni-modal whole-body humanoid motion generation that uses language, audio, and reference motions after large-scale data curation to achieve state-of-the-art performance and adaptation.
-
MARCH: Model-Assisted Reinforcement Learning for the Perceptive Control of Humanoids over Sparse Footholds
MARCH combines simplified-model trajectory generation with CLF-guided teacher RL and vision-policy distillation to enable stable humanoid locomotion over sparse terrain with better sample efficiency than pure model-free methods.
-
TAM: Torque Adaptation Module for Robust Motion Transfer in Manipulation
TAM is a policy-agnostic torque adaptation module trained in randomized simulation that improves zero-shot real-robot performance on dynamic manipulation tasks compared to system identification and RMA baselines.
-
Closed-Loop Sim-to-Real Reinforcement Learning for Deformable Microfiber Shape Control
A closed-loop sim-to-real RL policy trained in a simplified frictionless simulator achieves sub-millimeter microfiber shape control on physical hardware via visual feedback without retraining.