Self-supervised monocular depth estimation improves in low-texture regions by using distance transforms on jointly estimated pre-semantic contours to create more informative loss signals.
arXiv preprint arXiv:2312.01283 (2023)
3 Pith papers cite this work. Polarity classification is still indexing.
Citation summary
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
Roles: background (2)
Polarities: background (2)
Representative citing papers:
SS3D pretrains an end-to-end feed-forward 3D estimator on filtered YouTube-8M videos via SfM self-supervision, MVS filtering, and expert distillation, delivering stronger zero-shot transfer and fine-tuning than prior self-supervised baselines.
A probabilistic graphical model called 3WM unifies 3D vision tasks into one system that performs them zero-shot by selecting different inference pathways through multimodal scene nodes.
citing papers explorer
- Improved monocular depth prediction using distance transform over pre-semantic contours with self-supervised neural networks
- SS3D: End2End Self-Supervised 3D from Web Videos
SS3D pretrains an end-to-end feed-forward 3D estimator on filtered YouTube-8M videos via SfM self-supervision, MVS filtering, and expert distillation, delivering stronger zero-shot transfer and fine-tuning than prior self-supervised baselines.
- Unified 3D Scene Understanding Through Physical World Modeling
A probabilistic graphical model called 3WM unifies 3D vision tasks into one system that performs them zero-shot by selecting different inference pathways through multimodal scene nodes.