DP-SfM: Dual-Pixel Structure-from-Motion without Scale Ambiguity
Pith reviewed 2026-05-10 14:54 UTC · model grok-4.3
The pith
Dual-pixel sensor images resolve the unknown scale in multi-view 3D reconstruction without reference objects or calibration.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper establishes that multi-view images captured using a dual-pixel sensor can automatically resolve the scale ambiguity in structure-from-motion. The defocus blur observed in DP images provides sufficient information to determine the absolute scale when paired with depth maps recovered from multi-view 3D reconstruction. The authors present a simple linear method to estimate this absolute scale, followed by an intensity-based optimization stage that aligns the left and right DP images by shifting them back toward each other using cross-view blur kernels.
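As a minimal illustration of such a linear step (a sketch, not the paper's actual Eq. (7); it assumes defocus blur has already been converted into a per-point metric depth estimate), the absolute scale relating metric depths to up-to-scale SfM depths admits a closed-form least-squares solution:

```python
import numpy as np

def estimate_scale(z_metric, z_sfm):
    """Least-squares scale s minimizing ||z_metric - s * z_sfm||^2.

    z_metric: hypothetical metric depths derived from DP defocus blur.
    z_sfm:    up-to-scale depths from structure-from-motion.
    """
    z_metric = np.asarray(z_metric, dtype=float)
    z_sfm = np.asarray(z_sfm, dtype=float)
    # Normal equation of the one-parameter least-squares problem.
    return float(z_metric @ z_sfm / (z_sfm @ z_sfm))

# On clean data with true scale 2.5, the fit recovers it exactly.
print(estimate_scale([2.5, 5.0, 7.5], [1.0, 2.0, 3.0]))  # 2.5
```

The paper follows its linear step with an intensity-based refinement; in practice a robust variant of such a fit (e.g., RANSAC over point pairs) would guard against outlier depths.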
What carries the argument
The linear scale estimator that pairs defocus blur measurements from dual-pixel left-right images with up-to-scale depth maps from SfM, followed by cross-view blur-kernel alignment optimization.
Load-bearing premise
The defocus blur in dual-pixel images encodes reliable absolute-scale information once combined with the up-to-scale depths produced by standard structure-from-motion.
What would settle it
A controlled scene containing a measured object or baseline distance in which the scale recovered by the linear estimator and optimization deviates from the known physical ground truth.
Original abstract
Multi-view 3D reconstruction, namely, structure-from-motion followed by multi-view stereo, is a fundamental component of 3D computer vision. In general, multi-view 3D reconstruction suffers from an unknown scale ambiguity unless a reference object of known size is present in the scene. In this article, we show that multi-view images captured using a dual-pixel (DP) sensor can automatically resolve the scale ambiguity, without requiring a reference object or prior calibration. Specifically, the defocus blur observed in DP images provides sufficient information to determine the absolute scale when paired with depth maps (up to scale) recovered from multi-view 3D reconstruction. Based on this observation, we develop a simple yet effective linear method to estimate the absolute scale, followed by the intensity-based optimization stage that aligns the left and right DP images by shifting them back toward each other using cross-view blur kernels. Experiments demonstrate the effectiveness of the proposed approach across diverse scenes captured with different cameras and lenses. Code and data are available at https://github.com/lilika-makabe/dp-sfm-tpami.git
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that dual-pixel (DP) sensors in multi-view images can resolve the scale ambiguity inherent in structure-from-motion (SfM) reconstructions. It shows that defocus blur observed in DP left/right views supplies an absolute-scale constraint when combined with up-to-scale depth maps from SfM, via a linear estimator for the scale factor followed by an intensity-based optimization that aligns the views by shifting them according to cross-view blur kernels. Experiments on diverse scenes with multiple cameras and lenses are reported to validate the approach, with code and data released.
Significance. If the central claim holds, the work provides a practical, calibration-free route to metric 3D reconstruction that exploits hardware already present in many consumer cameras. The linear formulation, explicit use of the thin-lens defocus model, and release of reproducible code strengthen the contribution relative to prior scale-recovery techniques that require known objects or additional sensors.
major comments (2)
- [§4.2, Eq. (7)] The linear scale estimator assumes that the observed DP disparity is exactly proportional to the reciprocal of the SfM depth scaled by the unknown factor s; however, the derivation does not explicitly bound the error introduced when the thin-lens approximation deviates from the actual lens (e.g., spherical aberration or aperture-dependent effects). A sensitivity analysis or synthetic ablation under realistic lens models would be needed to confirm that the estimator remains unbiased.
- [§5.3, Table 2] The reported scale-error reductions are shown only for scenes where the DP baseline is non-zero and the focal plane is within the depth range; no quantitative results are given for the failure regime where the entire scene lies at the focal plane (zero blur), which would make the linear system singular. Clarifying the practical operating range is load-bearing for the claim of automatic scale recovery.
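The degeneracy flagged in the second comment can be made concrete with a toy version of an affine disparity model (a hypothetical stand-in for the paper's formulation): if disparity is modeled as d = c1·(1/z) + c2, the least-squares design matrix loses rank as soon as every point lies at a single depth, e.g., the focal plane:

```python
import numpy as np

def design_matrix(inv_depths):
    """Columns [1/z, 1] for the affine disparity model d = c1*(1/z) + c2."""
    inv = np.asarray(inv_depths, dtype=float)
    return np.stack([inv, np.ones_like(inv)], axis=1)

varied = design_matrix([0.5, 1.0, 2.0])  # depth variation present
flat = design_matrix([1.0, 1.0, 1.0])    # whole scene at one depth

print(np.linalg.matrix_rank(varied))  # 2: both coefficients identifiable
print(np.linalg.matrix_rank(flat))    # 1: rank-deficient, scale unrecoverable
```

In the zero-blur regime the disparities themselves are also near zero, so even a regularized solve would return an arbitrary scale; the referee's request to state this operating range explicitly seems warranted.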
minor comments (3)
- The notation for the left/right DP images (I_L, I_R) and the corresponding blur kernels is introduced without a clear diagram; adding a figure that illustrates the cross-view shift and kernel alignment would improve readability.
- [§4] Several equations in §4 use the symbol d for both scene depth and DP disparity; a brief disambiguation sentence or consistent subscripting would prevent confusion.
- The abstract states that the method works 'without requiring a reference object or prior calibration,' yet the experiments implicitly rely on known camera intrinsics from the SfM stage; a short clarification in the introduction would align the claim with the actual pipeline.
Simulated Author's Rebuttal
We thank the referee for the positive evaluation and the recommendation for minor revision. The comments highlight important aspects of the method's assumptions and operating range. We address each point below and will revise the manuscript accordingly to incorporate clarifications and additional analysis.
Point-by-point responses
- Referee: [§4.2, Eq. (7)] The linear scale estimator assumes that the observed DP disparity is exactly proportional to the reciprocal of the SfM depth scaled by the unknown factor s; however, the derivation does not explicitly bound the error introduced when the thin-lens approximation deviates from the actual lens (e.g., spherical aberration or aperture-dependent effects). A sensitivity analysis or synthetic ablation under realistic lens models would be needed to confirm that the estimator remains unbiased.
Authors: The linear estimator in §4.2 is derived under the thin-lens model, which is the standard approximation employed throughout the dual-pixel and defocus literature. While real lenses can introduce higher-order effects such as spherical aberration, the multi-camera, multi-lens experiments in §5 demonstrate consistent and accurate scale recovery on real data. To directly address the concern about potential bias, we will add a sensitivity analysis using synthetic data rendered with more realistic lens models (e.g., incorporating spherical aberration) in the revised manuscript. revision: yes
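The promised sensitivity analysis could be prototyped along these lines; the quadratic distortion term below is an invented stand-in for aberration-like effects, not a model taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
s_true = 2.0
z_sfm = rng.uniform(0.5, 3.0, 500)   # up-to-scale SfM depths
z_metric = s_true * z_sfm            # ground-truth metric depths

# Hypothetical lens imperfection: blur-derived depths acquire a mild
# depth-dependent distortion (a stand-in for aberration effects).
z_blur = z_metric * (1.0 + 0.02 * (z_metric - z_metric.mean()) ** 2)

# Closed-form least-squares fit of z_blur ≈ s * z_sfm.
s_hat = float(z_blur @ z_sfm / (z_sfm @ z_sfm))
bias = s_hat / s_true - 1.0
print(f"relative scale bias under distortion: {bias:+.2%}")
```

On undistorted data the same fit would return a bias of zero; the nonzero bias under distortion is exactly the quantity such an ablation would report as a function of perturbation strength.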
- Referee: [§5.3, Table 2] The reported scale-error reductions are shown only for scenes where the DP baseline is non-zero and the focal plane is within the depth range; no quantitative results are given for the failure regime where the entire scene lies at the focal plane (zero blur), which would make the linear system singular. Clarifying the practical operating range is load-bearing for the claim of automatic scale recovery.
Authors: We agree that zero defocus blur (entire scene at the focal plane) makes the linear system singular, as no cross-view disparity information is present. This is an inherent limitation of any defocus-based scale recovery method. The manuscript's experiments and claims focus on scenes exhibiting sufficient defocus, consistent with the problem setting. To clarify the practical operating range, we will add an explicit discussion in §5.3 stating the requirement for non-zero blur and noting the singular failure case. revision: yes
Circularity Check
No significant circularity; derivation is self-contained
full rationale
The paper's core derivation estimates absolute scale via a linear solver that combines observed DP defocus disparity (from the thin-lens model) with up-to-scale SfM depths; the scale factor is the unknown solved for, not presupposed. The follow-on kernel-alignment stage is a direct consequence of the recovered scale and does not feed back into the scale estimate. No equations reduce a prediction to a fitted input by construction, no uniqueness theorems are imported via self-citation, and no ansatz is smuggled in. The approach is externally falsifiable on real multi-camera data and remains independent of its own outputs.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: the standard pinhole camera model and defocus blur kernel assumptions hold for the DP sensor.
Reference graph
Works this paper leans on
- [1] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, 2nd ed. Cambridge University Press, 2004.
- [2] K. Ashida, H. Santo, F. Okura, and Y. Matsushita, "Resolving Scale Ambiguity in Multi-view 3D Reconstruction Using Dual-Pixel Sensors," in Proceedings of European Conference on Computer Vision (ECCV), 2024, pp. 162–178.
- [3] A. Punnappurath, A. Abuolaim, M. Afifi, and M. S. Brown, "Modeling Defocus-Disparity in Dual-Pixel Sensors," in International Conference on Computational Photography (ICCP), 2020, pp. 1–12.
- [4] E. Sucar and J.-B. Hayet, "Bayesian scale estimation for monocular SLAM based on generic object detection for correcting scale drift," in IEEE International Conference on Robotics and Automation (ICRA), 2018, pp. 5152–5158.
- [5] D. Frost, V. Prisacariu, and D. Murray, "Recovering stable scale in monocular SLAM using object-supplemented bundle adjustment," IEEE Transactions on Robotics, vol. 34, no. 3, pp. 736–747, 2018.
- [6] S. Song and M. Chandraker, "Robust Scale Estimation in Real-Time Monocular SFM for Autonomous Driving," in Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2014, pp. 1566–1573.
- [7] D. Zhou, Y. Dai, and H. Li, "Reliable scale estimation and correction for monocular visual odometry," in Proceedings of IEEE Intelligent Vehicles Symposium (IV), 2016, pp. 490–495.
- [8] D. Zhou, Y. Dai, and H. Li, "Ground-plane-based absolute scale estimation for monocular visual odometry," IEEE Transactions on Intelligent Transportation Systems, vol. 21, no. 2, pp. 791–802, 2019.
- [9] S. B. Knorr and D. Kurz, "Leveraging the user's face for absolute scale estimation in handheld monocular SLAM," in Proceedings of IEEE International Symposium on Mixed and Augmented Reality (ISMAR), 2016, pp. 11–17.
- [10] T. Roussel, L. Van Eycken, and T. Tuytelaars, "Monocular depth estimation in new environments with absolute scale," in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2019, pp. 1735–1741.
- [11] D. Rukhovich, D. Mouritzen, R. Kaestner, M. Rufli, and A. Velizhev, "Estimation of absolute scale in monocular SLAM using synthetic data," in Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2019, pp. 803–812.
- [12] S. F. Bhat, R. Birkl, D. Wofk, P. Wonka, and M. Müller, "ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth," arXiv preprint arXiv:2302.12288, 2023.
- [13] L. Yang, B. Kang, Z. Huang, X. Xu, J. Feng, and H. Zhao, "Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data," in Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 10371–10381.
- [14] L. Yang, B. Kang, Z. Huang, Z. Zhao, X. Xu, J. Feng, and H. Zhao, "Depth Anything V2," arXiv preprint arXiv:2406.09414, 2024.
- [15] J. Engel, J. Stückler, and D. Cremers, "Large-scale direct SLAM with stereo cameras," in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2015, pp. 1935–1942.
- [16] S. Sumikura, K. Sakurada, N. Kawaguchi, and R. Nakamura, "Scale Estimation of Monocular SfM for a Multi-modal Stereo Camera," in Proceedings of Asian Conference on Computer Vision (ACCV), 2019, pp. 281–297.
- [17] R. Giubilato, S. Chiodini, M. Pertile, and S. Debei, "Scale correct monocular visual odometry using a lidar altimeter," in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018, pp. 3694–3700.
- [18] G. Nützi, S. Weiss, D. Scaramuzza, and R. Siegwart, "Fusion of IMU and vision for absolute scale estimation in monocular SLAM," Journal of Intelligent & Robotic Systems, vol. 61, no. 1, pp. 287–299, 2011.
- [19] S. Zhang, J. Zhang, and D. Tao, "Towards scale-aware, robust, and generalizable unsupervised monocular depth estimation by integrating IMU motion dynamics," in Proceedings of European Conference on Computer Vision (ECCV), 2022, pp. 143–160.
- [20] D. Scaramuzza, F. Fraundorfer, M. Pollefeys, and R. Siegwart, "Absolute scale in structure from motion from a single vehicle mounted camera by exploiting nonholonomic constraints," in Proceedings of IEEE/CVF International Conference on Computer Vision (ICCV), 2009, pp. 1413–1419.
- [21] S. H. Lee and G. de Croon, "Stability-based scale estimation for monocular SLAM," IEEE Robotics and Automation Letters, vol. 3, no. 2, pp. 780–787, 2018.
- [22] A. Shibata, H. Fujii, A. Yamashita, and H. Asama, "Scale-reconstructable structure from motion using refraction with a single camera," in IEEE International Conference on Robotics and Automation (ICRA), 2015, pp. 5239–5244.
- [23] A. Shibata, H. Fujii, A. Yamashita, and H. Asama, "Absolute scale structure from motion using a refractive plate," in Proceedings of IEEE/SICE International Symposium on System Integration (SII), 2015, pp. 540–545.
- [24] C. Wöhler, P. d'Angelo, L. Krüger, A. Kuhl, and H.-M. Groß, "Monocular 3D scene reconstruction at absolute scale," ISPRS Journal of Photogrammetry and Remote Sensing, vol. 64, no. 6, pp. 529–540, 2009.
- [25] T. Shiozaki and G. Dissanayake, "Eliminating scale drift in monocular SLAM using depth from defocus," IEEE Robotics and Automation Letters, vol. 3, no. 1, pp. 581–587, 2017.
- [26] N. Mishima, A. Seki, and S. Hiura, "Absolute Scale from Varifocal Monocular Camera through SfM and Defocus Combined," in Proceedings of British Machine Vision Conference (BMVC), 2021.
- [27] M. Kashiwagi, N. Mishima, T. Kozakaya, and S. Hiura, "Deep Depth From Aberration Map," in Proceedings of IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 4070–4079.
- [28] A. Abuolaim, A. Punnappurath, and M. S. Brown, "Revisiting Autofocus for Smartphone Cameras," in Proceedings of European Conference on Computer Vision (ECCV), 2018, pp. 545–559.
- [29] C. Herrmann, R. S. Bowen, N. Wadhwa, R. Garg, Q. He, J. T. Barron, and R. Zabih, "Learning to Autofocus," in Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 2227–2236.
- [30] A. Abuolaim and M. S. Brown, "Defocus Deblurring Using Dual-Pixel Data," in Proceedings of European Conference on Computer Vision (ECCV), 2020, pp. 111–126.
- [31] L. Pan, S. Chowdhury, R. Hartley, M. Liu, H. Zhang, and H. Li, "Dual Pixel Exploration: Simultaneous Depth Estimation and Image Restoration," in Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 4340–4349.
- [32] A. Abuolaim, R. Timofte, and M. S. Brown, "NTIRE 2021 Challenge for Defocus Deblurring Using Dual-Pixel Images: Methods and Results," in Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2021, pp. 578–587.
- [33] A. Abuolaim, M. Delbracio, D. Kelly, M. S. Brown, and P. Milanfar, "Learning To Reduce Defocus Blur by Realistically Modeling Dual-Pixel Data," in Proceedings of IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 2289–2298.
- [34] S. Xin, N. Wadhwa, T. Xue, J. T. Barron, P. P. Srinivasan, J. Chen, I. Gkioulekas, and R. Garg, "Defocus Map Estimation and Deblurring From a Single Dual-Pixel Image," in Proceedings of IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 2228–2238.
- [35] Y. Yang, L. Pan, L. Liu, and M. Liu, "K3DN: Disparity-Aware Kernel Estimation for Dual-Pixel Defocus Deblurring," in Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 13263–13272.
- [36] A. Abuolaim, M. Afifi, and M. S. Brown, "Improving Single-Image Defocus Deblurring: How Dual-Pixel Images Help Through Multi-Task Learning," in IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2022, pp. 82–90.
- [37] S. H. Jung and Y. S. Heo, "Disparity probability volume guided defocus deblurring using dual pixel data," in Proceedings of International Conference on Information and Communication Technology Convergence (ICTC), 2021, pp. 305–308.
- [38] N. Wadhwa, R. Garg, D. E. Jacobs, B. E. Feldman, N. Kanazawa, R. Carroll, Y. Movshovitz-Attias, J. T. Barron, Y. Pritch, and M. Levoy, "Synthetic depth-of-field with a single-camera mobile phone," ACM Transactions on Graphics (TOG), pp. 1–13, 2018.
- [40] D. Kim, H. Jang, I. Kim, and M. H. Kim, "Spatio-Focal Bidirectional Disparity Estimation From a Dual-Pixel Image," in Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 5023–5032.
- [41] F. Li, H. Guo, H. Santo, F. Okura, and Y. Matsushita, "Learning to Synthesize Photorealistic Dual-pixel Images from RGBD frames," in International Conference on Computational Photography (ICCP), 2023, pp. 1–11.
- [42] A. Punnappurath and M. S. Brown, "Reflection Removal Using a Dual-Pixel Sensor," in Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 1556–1565.
- [43] M. Kang, J. Choe, H. Ha, H.-G. Jeon, S. Im, and I. S. Kweon, "Facial Depth and Normal Estimation using Single Dual-Pixel Camera," in Proceedings of European Conference on Computer Vision (ECCV), 2022, pp. 181–200.
- [44] Y. Zhang, N. Wadhwa, S. Orts-Escolano, C. Häne, S. R. Fanello, and R. Garg, "Du2Net: Learning Depth Estimation from Dual-Cameras and Dual-Pixels," in Proceedings of European Conference on Computer Vision (ECCV), 2020, pp. 582–598.
- [45] J. Gentle, Matrix Algebra, ser. Springer Texts in Statistics. Springer, New York, 2007.
- [46] M. A. Fischler and R. C. Bolles, "Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography," Communications of the ACM, vol. 24, no. 6, pp. 381–395, 1981.
- [47] R. Garg, N. Wadhwa, S. Ansari, and J. T. Barron, "Learning Single Camera Depth Estimation using Dual-Pixels," in Proceedings of IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 7628–7637.
- [48] L. Piccinelli, Y.-H. Yang, C. Sakaridis, M. Segu, S. Li, L. Van Gool, and F. Yu, "UniDepth: Universal monocular metric depth estimation," in Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
- [49] L. Piccinelli, C. Sakaridis, Y.-H. Yang, M. Segu, S. Li, W. Abbeloos, and L. V. Gool, "UniDepthV2: Universal monocular metric depth estimation made simpler," arXiv preprint arXiv:2502.20110, 2025.