MuJoCo: A physics engine for model-based control
13 Pith papers cite this work. Polarity classification is still indexing.
Citation roles: background (1). Citation polarities: unclear (1).
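For orientation, here is a minimal sketch of driving the cited engine through the official `mujoco` Python bindings. The falling-box model is an arbitrary illustration, not taken from any citing paper:

```python
import mujoco

XML = """
<mujoco>
  <worldbody>
    <body name="box" pos="0 0 1">
      <freejoint/>
      <geom type="box" size="0.1 0.1 0.1"/>
    </body>
  </worldbody>
</mujoco>
"""

model = mujoco.MjModel.from_xml_string(XML)
data = mujoco.MjData(model)
for _ in range(500):
    mujoco.mj_step(model, data)   # advance the simulation one timestep
print(data.body("box").xpos)      # world position of the box after falling
```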
Citing papers explorer
- Neuromorphic Reinforcement Learning for Quadruped Locomotion Control on Uneven Terrain
An equilibrium-propagation-based PPO controller for a 12-DoF quadruped achieves locomotion performance comparable to backpropagation-trained PPO on uneven terrain while using 4.3 times less GPU memory (a generic equilibrium-propagation update is sketched after this list).
- EgoFun3D: Modeling Interactive Objects from Egocentric Videos using Function Templates
EgoFun3D introduces a new task, a 271-video dataset, and a pipeline that uses function templates to model interactive 3D objects from egocentric videos for simulation.
- Mirror, Mirror on the Wall: Can VLM Agents Tell Who They Are at All?
Stronger VLM agents use mirror reflections for self-identification in controlled 3D tests, while weaker ones inspect but fail to extract or correctly attribute self-relevant information.
- Lucid-XR: An Extended-Reality Data Engine for Robotic Manipulation
Lucid-XR uses XR-headset physics simulation and physics-guided video generation to create synthetic data that trains robot policies transferring zero-shot to unseen real-world manipulation tasks.
- VADF: Vision-Adaptive Diffusion Policy Framework for Efficient Robotic Manipulation
VADF adds an Adaptive Loss Network for hard-negative sampling during training and a Hierarchical Vision Task Segmenter for adaptive noise scheduling during inference, speeding convergence and reducing timeouts in diffusion-based robot policies.
- Solving Physics Olympiad via Reinforcement Learning on Physics Simulators
Physics simulators generate synthetic QA data for RL training that improves LLM performance on IPhO problems by 5-10 percentage points.
- FlashSAC: Fast and Stable Off-Policy Reinforcement Learning for High-Dimensional Robot Control
FlashSAC scales up Soft Actor-Critic with fewer updates, larger models, higher data throughput, and norm bounds to deliver faster, more stable training than PPO on high-dimensional robot control tasks across dozens of simulators (the standard SAC target it builds on is sketched after this list).
- Isaac Lab: A GPU-Accelerated Simulation Framework for Multi-Modal Robot Learning
Isaac Lab is a unified GPU-native platform combining high-fidelity physics, photorealistic rendering, multi-frequency sensors, domain randomization, and learning pipelines for scalable multi-modal robot policy training.
- AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
AWAC combines offline data with online RL via advantage-weighted actor-critic updates to enable faster acquisition of robotic skills such as dexterous manipulation (the advantage-weighted loss is sketched after this list).
- Rethinking Priority Scheduling for Sequential Multi-Agent Decision Making in Stackelberg Games
HPA dynamically selects agent decision orders in Stackelberg games to improve equilibria and performance in multi-agent MuJoCo control tasks.
- Gated Memory Policy
GMP selectively activates and represents memory via a gate and lightweight cross-attention, yielding 30.1% higher success on non-Markovian robotic tasks while staying competitive on Markovian ones (a gated cross-attention read is sketched after this list).
- ComSim: Building Scalable Real-World Robot Data Generation via Compositional Simulation
ComSim generates real-world robot training data at scale by combining classical simulation with neural simulation in a closed-loop real-sim-real augmentation pipeline.
- Gymnasium: A Standard Interface for Reinforcement Learning Environments
Gymnasium establishes a standardized API for RL environments to improve interoperability, reproducibility, and ease of development in reinforcement learning (the core API loop is sketched after this list).
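For the equilibrium-propagation entry above, the two-phase parameter update at the heart of EP can be sketched as follows. This is the generic EP contrast, not the paper's controller; `energy`, the settled states, and the hyperparameters are assumed placeholders:

```python
import torch

def ep_update(energy, params, s_free, s_nudged, beta, lr):
    """Generic equilibrium-propagation update (a sketch, not the paper's method).

    energy:   assumed scalar energy function of (state, params); params require grad
    s_free:   network state settled with no output nudging (beta = 0 phase)
    s_nudged: network state settled while outputs are nudged toward the target
    """
    g_free = torch.autograd.grad(energy(s_free, params), params)
    g_nudged = torch.autograd.grad(energy(s_nudged, params), params)
    with torch.no_grad():
        for p, gf, gn in zip(params, g_free, g_nudged):
            p -= lr * (gn - gf) / beta  # phase contrast approximates the loss gradient
```

Because the contrast of two settled phases needs no stored activation history, unlike backpropagation through time, this is presumably where the reported GPU-memory savings come from.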
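The AWAC entry's mechanism, advantage-weighted actor updates, reduces to a weighted log-likelihood loss. A minimal sketch; tensor names and the clamp are assumptions, not the paper's exact code:

```python
import torch

def awac_actor_loss(logp, q, v, lam=1.0):
    # logp: log pi(a|s) for actions taken from offline data or the replay buffer
    # q:    critic estimate Q(s, a); v: baseline V(s)
    adv = q - v
    w = torch.exp(adv / lam).clamp(max=100.0)  # exponential advantage weights; clamp is a common stability trick
    return -(logp * w.detach()).mean()         # weighted log-likelihood: imitate high-advantage actions
```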
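The FlashSAC entry names a recipe (fewer updates, larger models, norm bounds) layered on standard Soft Actor-Critic. The paper-specific pieces are not reproduced here; for reference, the soft bootstrapped target being scaled is the usual one:

```python
import torch

def sac_target(reward, done, next_q1, next_q2, next_logp, alpha=0.2, gamma=0.99):
    # Standard SAC target: twin-Q minimum plus an entropy bonus.
    # FlashSAC's norm bounds and throughput machinery are not shown here.
    next_v = torch.min(next_q1, next_q2) - alpha * next_logp
    return reward + gamma * (1.0 - done) * next_v
```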
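For the Gated Memory Policy entry, a hypothetical reconstruction of "a gate and lightweight cross-attention": the module, sizes, and residual wiring below are assumptions in the spirit of the summary, not the paper's architecture:

```python
import torch
import torch.nn as nn

class GatedMemoryRead(nn.Module):
    def __init__(self, dim, n_heads=4):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(dim, 1), nn.Sigmoid())  # "should I consult memory?"
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)

    def forward(self, obs_feat, memory):
        # obs_feat: (B, dim) current observation features; memory: (B, T, dim) stored history
        g = self.gate(obs_feat)                 # (B, 1), can stay near 0 on Markovian steps
        q = obs_feat.unsqueeze(1)               # query with the current step
        read, _ = self.attn(q, memory, memory)  # lightweight cross-attention over memory
        return obs_feat + g * read.squeeze(1)   # gated residual read
```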
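Finally, the Gymnasium entry's standardized API is concrete and well documented. The canonical interaction loop, with CartPole chosen only for illustration:

```python
import gymnasium as gym

env = gym.make("CartPole-v1")
obs, info = env.reset(seed=0)
terminated = truncated = False
while not (terminated or truncated):
    action = env.action_space.sample()  # stand-in for a learned policy
    obs, reward, terminated, truncated, info = env.step(action)
env.close()
```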