SPOT: SE(3) pose trajectory diffusion for object-centric manipulation

2024 · arXiv 2411.00965

6 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Multimodal Diffusion Forcing for Forceful Manipulation

cs.RO · 2025-11-06 · unverdicted · novelty 7.0

Multimodal Diffusion Forcing trains a diffusion model on partially masked multimodal robot trajectories to learn temporal and cross-modal dependencies for forceful manipulation.

AFFORD2ACT: Affordance-Guided Automatic Keypoint Selection for Generalizable and Lightweight Robotic Manipulation

cs.RO · 2025-10-01 · unverdicted · novelty 6.0

AFFORD2ACT distills a minimal set of affordance-guided 2D keypoints from text and a single image to train a 38-dimensional gated transformer policy that achieves 82% success on unseen objects and scenes.

Robotic Manipulation by Imitating Generated Videos Without Physical Demonstrations

cs.RO · 2025-07-01 · unverdicted · novelty 6.0

RIGVid shows that filtered AI-generated videos can serve as effective supervision for complex robotic manipulation tasks without any real demonstrations.

UniVLA: Learning to Act Anywhere with Task-centric Latent Actions

cs.RO · 2025-05-09 · unverdicted · novelty 6.0

UniVLA trains cross-embodiment vision-language-action policies from unlabeled videos via a latent action model in DINO space, beating OpenVLA on benchmarks with 1/20th pretraining compute and 1/10th downstream data.

HumanEgo: Zero-Shot Robot Learning from Minutes of Human Egocentric Videos

cs.RO · 2026-05-24 · unverdicted · novelty 5.0

HumanEgo reports 92.5% average success on four real robot tasks using only 15-30 minutes of human video per task and zero robot data, with zero-shot transfer to new robots and cameras.

FUNCanon: Learning Pose-Aware Action Primitives via Functional Object Canonicalization for Generalizable Robotic Manipulation

cs.RO · 2025-09-23 · unverdicted · novelty 5.0

FunCanon introduces functional object canonicalization with VLM affordances to create pose-aware action primitives for generalizable imitation learning in robotic manipulation.

citing papers explorer

Multimodal Diffusion Forcing for Forceful Manipulation · cs.RO · 2025-11-06 · unverdicted · none · ref 16
AFFORD2ACT: Affordance-Guided Automatic Keypoint Selection for Generalizable and Lightweight Robotic Manipulation · cs.RO · 2025-10-01 · unverdicted · none · ref 22
Robotic Manipulation by Imitating Generated Videos Without Physical Demonstrations · cs.RO · 2025-07-01 · unverdicted · none · ref 46
UniVLA: Learning to Act Anywhere with Task-centric Latent Actions · cs.RO · 2025-05-09 · unverdicted · none · ref 32
HumanEgo: Zero-Shot Robot Learning from Minutes of Human Egocentric Videos · cs.RO · 2026-05-24 · unverdicted · none · ref 48
FUNCanon: Learning Pose-Aware Action Primitives via Functional Object Canonicalization for Generalizable Robotic Manipulation · cs.RO · 2025-09-23 · unverdicted · none · ref 4
