TAPIR: Tracking Any Point with per-frame Initialization and temporal Refinement

2023 · arXiv 2306.08637

4 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

Human Universal Grasping

cs.RO · 2026-06-15 · unverdicted · novelty 7.0

HUG trains a flow-matching model on a new 1M-frame egocentric human grasp dataset to generate retargetable grasps from single RGB-D images, beating baselines by 23-34% on a new 90-object benchmark.

MolmoMotion: Forecasting Point Trajectories in 3D with Language Instruction

cs.CV · 2026-06-17 · unverdicted · novelty 6.0

Introduces a new task of goal-conditioned 3D point motion forecasting along with a 1.16M-video dataset, a 111-category benchmark, and a model that outperforms baselines while transferring to robotics and video generation.

Turning Video Models into Generalist Robot Policies

cs.RO · 2026-05-27 · unverdicted · novelty 6.0

Decouples action-free video world models from embodiment-specific IDMs using Jacobian-based translation to achieve zero-shot cross-embodiment robot policies.

Intuitive Surgical SurgToolLoc and SurgVU Challenges Results: 2022-2025

cs.CV · 2023-05-11 · unverdicted · novelty 2.0

The paper summarizes results from the SurgToolLoc and SurgVU challenges held at MICCAI conferences from 2022 to 2025.

citing papers explorer


Human Universal Grasping

cs.RO · 2026-06-15 · unverdicted · none · ref 41

HUG trains a flow-matching model on a new 1M-frame egocentric human grasp dataset to generate retargetable grasps from single RGB-D images, beating baselines by 23-34% on a new 90-object benchmark.

MolmoMotion: Forecasting Point Trajectories in 3D with Language Instruction

cs.CV · 2026-06-17 · unverdicted · none · ref 21

Introduces a new task of goal-conditioned 3D point motion forecasting along with a 1.16M-video dataset, a 111-category benchmark, and a model that outperforms baselines while transferring to robotics and video generation.

Turning Video Models into Generalist Robot Policies

cs.RO · 2026-05-27 · unverdicted · none · ref 43

Decouples action-free video world models from embodiment-specific IDMs using Jacobian-based translation to achieve zero-shot cross-embodiment robot policies.
