DUSt3R: Geometric 3D Vision Made Easy

Shuzhe Wang, Vincent Leroy, Yohann Cabon, Boris Chidlovskii, Jerome Revaud · 2024

7 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

Reshoot-Anything: A Self-Supervised Model for In-the-Wild Video Reshooting

cs.CV · 2026-04-23 · unverdicted · novelty 7.0

Reshoot-Anything trains a diffusion transformer on pseudo multi-view triplets created by cropping and warping monocular videos to achieve temporally consistent video reshooting with robust camera control on dynamic scenes.

AnchorSplat: Feed-Forward 3D Gaussian Splatting with 3D Geometric Priors

cs.CV · 2026-04-08 · unverdicted · novelty 7.0

AnchorSplat uses anchor-aligned 3D Gaussians guided by geometric priors for feed-forward scene reconstruction, achieving SOTA novel view synthesis on ScanNet++ with fewer primitives and better view consistency.

POMA-3D: The Point Map Way to 3D Scene Understanding

cs.CV · 2025-11-20 · unverdicted · novelty 7.0

POMA-3D learns self-supervised 3D scene representations from point maps and improves performance on geometric 3D tasks including navigation and scene retrieval.

LA-Pose: Latent Action Pretraining Meets Pose Estimation

cs.CV · 2026-04-30 · unverdicted · novelty 6.0

LA-Pose achieves over 10% higher pose accuracy than recent feed-forward methods on Waymo and PandaSet benchmarks by repurposing latent actions from self-supervised inverse-dynamics pretraining while using orders of magnitude less labeled 3D data.

CylinderDepth: Cylindrical Spatial Attention for Multi-View Consistent Self-Supervised Surround Depth Estimation

cs.CV · 2025-11-20 · unverdicted · novelty 6.0

CylinderDepth uses cylindrical spatial attention with non-learned weights to enforce cross-view consistency in self-supervised surround depth estimation.

Co-Me: Confidence-Guided Token Merging for Visual Geometric Transformers

cs.CV · 2025-11-18 · unverdicted · novelty 6.0

Co-Me distills a confidence predictor to selectively merge low-confidence tokens in visual geometric transformers, delivering up to 21.5x speedup on VGGT and 20.4x on Pi3 while preserving spatial coverage and performance.

PAOLI: Pose-free Articulated Object Learning from Sparse-view Images

cs.CV · 2025-09-04 · unverdicted · novelty 6.0

A pipeline that reconstructs articulated objects from sparse unposed images by aligning independent per-pose reconstructions via learned deformation fields and progressive static/moving part disentanglement.

citing papers explorer


Reshoot-Anything: A Self-Supervised Model for In-the-Wild Video Reshooting
cs.CV · 2026-04-23 · unverdicted · none · ref 35

AnchorSplat: Feed-Forward 3D Gaussian Splatting with 3D Geometric Priors
cs.CV · 2026-04-08 · unverdicted · none · ref 36

POMA-3D: The Point Map Way to 3D Scene Understanding
cs.CV · 2025-11-20 · unverdicted · none · ref 48

LA-Pose: Latent Action Pretraining Meets Pose Estimation
cs.CV · 2026-04-30 · unverdicted · none · ref 34

CylinderDepth: Cylindrical Spatial Attention for Multi-View Consistent Self-Supervised Surround Depth Estimation
cs.CV · 2025-11-20 · unverdicted · none · ref 36

Co-Me: Confidence-Guided Token Merging for Visual Geometric Transformers
cs.CV · 2025-11-18 · unverdicted · none · ref 37

PAOLI: Pose-free Articulated Object Learning from Sparse-view Images
cs.CV · 2025-09-04 · unverdicted · none · ref 46
