Equivariant diffusion policy
7 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields: cs.RO 7
verdicts: UNVERDICTED 7
roles: background 1
polarities: background 1
representative citing papers
FAFM performs flow matching in the frequency domain using DCT on action sequences to produce continuous temporally consistent robotic actions with a Sobolev-style smoothness regularizer.
EquiVLA is the first general framework for end-to-end SO(2)-equivariant VLA models using EquiPerceptor and EquiActor modules, reporting improved success rates on LIBERO, CALVIN, and real-robot benchmarks.
GLAM learns a shared latent action space grounded in consistent future observation prediction across heterogeneous data sources to train improved behavioral cloning policies for robot manipulation tasks.
SID achieves approximately 90% success on six real-world manipulation tasks with only two demonstrations under out-of-distribution initializations, with less than 10% performance drop under distractors and disturbances.
R2RGen introduces a simulator-free three-stage pipeline that parses, augments, and post-processes real pointcloud observation-action pairs to improve spatial generalization in robotic manipulation policies.
PACTS jointly models action trajectories and predicate belief trajectories in a single generative policy, enabling zero-shot skill composition via symbolic planning without retraining.
citing papers explorer
- Learning to See While Learning to Act: Diffusion Models for Active Perception in Robot Imitation
  See2Act couples action denoising with viewpoint refinement in a diffusion-based imitation learning policy trained on keyframe-anchored camera poses, recovering informative views under occlusion and improving RLBench performance by up to 34% with zero-shot sim-to-real transfer.
- Frequency-Aware Flow Matching for Continuous and Consistent Robotic Action Generation
  FAFM performs flow matching in the frequency domain using DCT on action sequences to produce continuous temporally consistent robotic actions with a Sobolev-style smoothness regularizer.
- EquiVLA: A General Framework for Rotationally Equivariant Vision-Language-Action Models
  EquiVLA is the first general framework for end-to-end SO(2)-equivariant VLA models using EquiPerceptor and EquiActor modules, reporting improved success rates on LIBERO, CALVIN, and real-robot benchmarks.
- Imitation from Heterogeneous Demonstrations using Grounded Latent-Action World Models
  GLAM learns a shared latent action space grounded in consistent future observation prediction across heterogeneous data sources to train improved behavioral cloning policies for robot manipulation tasks.
- SID: Sliding into Distribution for Robust Few-Demonstration Manipulation
  SID achieves approximately 90% success on six real-world manipulation tasks with only two demonstrations under out-of-distribution initializations, with less than 10% performance drop under distractors and disturbances.
- R2RGEN: Real-to-Real 3D Data Generation for Spatially Generalized Manipulation
  R2RGen introduces a simulator-free three-stage pipeline that parses, augments, and post-processes real pointcloud observation-action pairs to improve spatial generalization in robotic manipulation policies.
- Jointly Learning Predicates and Actions Enables Zero-Shot Skill Composition
  PACTS jointly models action trajectories and predicate belief trajectories in a single generative policy, enabling zero-shot skill composition via symbolic planning without retraining.