DemoGen: Synthetic demonstration generation for data-efficient visuomotor policy learning
26 Pith papers cite this work.
citation-role summary: background (3)
citation-polarity summary: background (3)
representative citing papers
ExpertGen generates high-success expert policies in simulation from imperfect priors by freezing a diffusion behavior model and optimizing its initial noise via RL, then distills them for real-robot deployment.
citing papers explorer
- DeformGen: Dynamics-Based Topology Augmentation for Deformable Manipulation Policy Learning
  DeformGen uses dynamics-based state expansion via localized disturbances and deformation-field warping for trajectory transfer to improve policy learning on deformable manipulation benchmarks.
- DockAnywhere: Data-Efficient Visuomotor Policy Learning for Mobile Manipulation via Novel Demonstration Generation
  DockAnywhere lifts single demonstrations to diverse docking points via structure-preserving augmentation and point-cloud spatial editing to improve viewpoint generalization in visuomotor policies for mobile manipulation.
- Referring-Aware Visuomotor Policy Learning for Closed-Loop Manipulation
  ReV is a referring-aware visuomotor policy using coupled diffusion heads for real-time trajectory replanning in robotic manipulation, trained solely via targeted perturbations to expert demonstrations and achieving higher success rates in simulated and real tasks.
- Assistron: Bayesian Shared Autonomy with Off-the-shelf Vision-Language-Action Models
  Assistron combines pre-trained VLA models with phase-aware Bayesian shared autonomy and flow matching guidance to raise task success rates and lower human workload in manipulation benchmarks without model fine-tuning.
- Inductive Generalization for Robotic Manipulation
  The paper introduces an inductive generalization evaluation protocol for manipulation policies and shows that SOTA vision-language-action models fail on progressively harder task variants.
- One Demo is Worth a Thousand Trajectories: Action-View Augmentation for Visuomotor Policies
  A framework augments single fisheye demonstrations into multiple novel-view trajectories with obstacles via fisheye-adapted Gaussian Splatting and trajectory optimization, raising policy success rates in original and modified scenes.
- Video2Sim2Real: Full-Stack Autonomous Dexterous Skill Acquisition from a Single Human Video
  Video2Sim2Real turns a single human video into a deployable robot manipulation skill by reconstructing a digital twin, anchoring motions to object-centric simulator configurations, and bridging sim-to-real gaps with imitation learning and residual RL.
- SID: Sliding into Distribution for Robust Few-Demonstration Manipulation
  SID achieves approximately 90% success on six real-world manipulation tasks with only two demonstrations under out-of-distribution initializations, with less than 10% performance drop under distractors and disturbances.
- A Principled Approach for Creating High-fidelity Synthetic Demonstrations for Imitation Learning
  DMP retargeting within 3DGS scenes preserves expert motion shape and phase to create diverse yet high-fidelity demonstrations, yielding lower deviation, fewer collisions, and higher downstream policy success than planner-based synthesis on Spot manipulator tasks.
- One-Shot Cross-Geometry Skill Transfer through Part Decomposition
  Part decomposition with generative shape models allows one-shot robot skill transfer across unfamiliar object geometries in simulation and real settings.
- Generative Simulation for Policy Learning in Physical Human-Robot Interaction
  A text-to-simulation pipeline using LLMs and VLMs generates synthetic pHRI data to train vision-based imitation learning policies that achieve over 80% success in zero-shot sim-to-real transfer on real assistive tasks.
- TwinRL: Digital Twin-Driven Reinforcement Learning for Real-World Robotic Manipulation
  TwinRL expands RL exploration via digital twin reconstruction and twin RL warm-up to guide real-world learning, reaching near-100% success with 20 minutes of on-robot time across four tasks.
- IGen: Scalable Data Generation for Robot Learning from Open-World Images
  IGen generates realistic visuomotor training data including actions and temporally coherent visuals from unstructured open-world images via 3D reconstruction and VLM reasoning.
- R2RGen: Real-to-Real 3D Data Generation for Spatially Generalized Manipulation
  R2RGen introduces a simulator-free three-stage pipeline that parses, augments, and post-processes real point-cloud observation-action pairs to improve spatial generalization in robotic manipulation policies.
- GraspVLA: a Grasping Foundation Model Pre-trained on Billion-scale Synthetic Action Data
  GraspVLA shows that pretraining a grasping model on a billion synthetic action frames enables zero-shot open-vocabulary performance and sim-to-real transfer.
- WorldSample: Closed-loop Real-robot RL with World Modelling
  WorldSample generates synthetic transitions from a post-trained world model grounded in real rollouts and uses Policy-Paced Learning to improve RL policies, reporting 28% higher success rates and 59% fewer training steps on contact-rich robot tasks.
- TSD: A Physics-Inspired Trajectory Saliency Detector for Efficient Imitation Learning
  TSD applies two physics metrics to identify salient trajectory segments for dataset compression and expansion in robotic imitation learning, yielding comparable performance with 25% less data on average.
- MirrorDuo: Reflection-Consistent Visuomotor Learning from Mirrored Demonstration Pairs
  MirrorDuo augments demonstration data via reflection to improve behavior cloning and diffusion policies, enabling better performance or cross-side transfer with limited demos.
- ManiSplat: Manipulation Trajectory Synthesis from Monocular Video via Decoupled 3D Gaussian Splatting
  ManiSplat introduces a graph-structured disentangled 3D Gaussian framework with task-oriented alignment to reconstruct controllable dynamic scenes from monocular ego-view robotic videos.
- ComSim: Building Scalable Real-World Robot Data Generation via Compositional Simulation
  Compositional Simulation generates scalable real-world robot training data by combining classical simulation with neural simulation in a closed-loop real-sim-real augmentation pipeline.
- RESample: A Robust Data Augmentation Framework via Exploratory Sampling for Robotic Manipulation
  RESample uses exploratory sampling guided by a lightweight Coverage Function to expand VLA training data coverage, yielding 12% performance gains on LIBERO and real-world tasks with 10-20% added samples.
- Flow with the Force Field: Learning 3D Compliant Flow Matching Policies from Force and Demonstration-Guided Simulation Data
  The framework generates force-informed sim data from one demo to train compliant visuomotor flow matching policies, showing reliable contact on real-robot block flipping and bi-manual tasks.
- Vision-Language-Action in Robotics: A Survey of Datasets, Benchmarks, and Data Engines
  A survey of VLA robotics research identifies data infrastructure as the primary bottleneck and distills four open challenges in representation alignment, multimodal supervision, reasoning assessment, and scalable data generation.
- 3D Generation for Embodied AI and Robotic Simulation: A Survey
  The paper surveys 3D generation techniques for embodied AI and robotics, categorizing them into data generation, simulation environments, and sim-to-real bridging while identifying bottlenecks in physical validity and transfer.