A kinematic-to-visual lifting paradigm combined with hierarchically routed control generates action-conditioned surgical videos with better faithfulness, fidelity, and efficiency.
Rig3r: Rig- aware conditioning for learned 3d reconstruction
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields
cs.CV 3years
2026 3verdicts
UNVERDICTED 3roles
background 1polarities
background 1representative citing papers
KITScenes LongTail supplies multimodal driving data and multilingual expert reasoning traces to benchmark models on rare scenarios beyond basic safety metrics.
LA-Pose achieves over 10% higher pose accuracy than recent feed-forward methods on Waymo and PandaSet benchmarks by repurposing latent actions from self-supervised inverse-dynamics pretraining while using orders of magnitude less labeled 3D data.
citing papers explorer
-
From Articulated Kinematics to Routed Visual Control for Action-Conditioned Surgical Video Generation
A kinematic-to-visual lifting paradigm combined with hierarchically routed control generates action-conditioned surgical videos with better faithfulness, fidelity, and efficiency.
-
LongTail Driving Scenarios with Reasoning Traces: The KITScenes LongTail Dataset
KITScenes LongTail supplies multimodal driving data and multilingual expert reasoning traces to benchmark models on rare scenarios beyond basic safety metrics.
-
LA-Pose: Latent Action Pretraining Meets Pose Estimation
LA-Pose achieves over 10% higher pose accuracy than recent feed-forward methods on Waymo and PandaSet benchmarks by repurposing latent actions from self-supervised inverse-dynamics pretraining while using orders of magnitude less labeled 3D data.