hub Canonical reference

MimicPlay: Long-horizon imitation learning by watching human play

Canonical reference. Of the classified citations from Pith papers, 88% cite this work as background.

18 Pith papers citing it
Background 88% of classified citations

citation-role summary

background 8

citation-polarity summary

background 7 · unclear 1

representative citing papers

Referring-Aware Visuomotor Policy Learning for Closed-Loop Manipulation

cs.RO · 2026-04-07 · unverdicted · novelty 7.0

ReV is a referring-aware visuomotor policy using coupled diffusion heads for real-time trajectory replanning in robotic manipulation, trained solely via targeted perturbations to expert demonstrations and achieving higher success rates in simulated and real tasks.

MonoDuo: Using One Robot Arm to Learn Bimanual Policies

cs.RO · 2026-05-28 · unverdicted · novelty 6.0

MonoDuo generates synthetic bimanual demonstrations from single-arm teleoperation plus human collaboration to train policies achieving up to 70% zero-shot success on five manipulation tasks, with 65-70% gains from 25-shot finetuning.

Bridging the Embodiment Gap: Disentangled Cross-Embodiment Video Editing

cs.RO · 2026-05-05 · unverdicted · novelty 6.0

A dual-contrastive disentanglement method factorizes videos into independent task and embodiment latents, then uses a parameter-efficient adapter on a frozen video diffusion model to synthesize robot executions from single human demonstrations without paired data.

GazeVLA: Learning Human Intention for Robotic Manipulation

cs.RO · 2026-04-24 · unverdicted · novelty 6.0

GazeVLA pretrains on large human egocentric datasets to capture gaze-based intention, then finetunes on limited robot data with chain-of-thought reasoning to achieve better robotic manipulation performance than baselines.

Unify Robot Actions in Camera Frame

cs.RO · 2025-11-21 · conditional · novelty 6.0

CalibAll estimates camera extrinsics on existing datasets to convert robot actions into a unified camera-frame representation, enabling stronger cross-embodiment pretraining.

FLARE: Robot Learning with Implicit World Modeling

cs.RO · 2025-05-21 · unverdicted · novelty 6.0

FLARE integrates predictive latent world modeling into diffusion transformer policies for robots, delivering up to 26% gains on multitask manipulation benchmarks and enabling co-training with action-free human videos.
