arxiv: 2605.08213 · v1 · submitted 2026-05-06 · 💻 cs.CV

Low-Cost Stereo Vision for Robust 3D Positioning of Thin Radiata Pine Branches in Autonomous Drone Pruning

Yida Lin, Bing Xue, Mengjie Zhang, Sam Schofield, Richard Green

Pith reviewed 2026-05-12 00:45 UTC · model grok-4.3

keywords: stereo vision, drone pruning, branch segmentation, radiata pine, depth estimation, 3D positioning, autonomous forestry, YOLO segmentation

The pith

A low-cost stereo camera on a drone can locate thin 10 mm pine branches in 3D for autonomous pruning without extra sensors.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether one inexpensive stereo camera mounted on a drone can replace costly LiDAR or similar sensors for detecting and positioning radiata pine branches as thin as 10 mm. It builds a two-stage pipeline that first segments branches with Mask R-CNN and YOLO models trained on a custom set of 71 stereo pairs, then estimates depth and converts the results into a single reliable distance using centroid triangulation plus median-absolute-deviation outlier rejection. This matters for New Zealand forestry, where manual pruning is dangerous and labor is scarce, and cheaper drone systems could extend pruning to thinner branches that current platforms avoid. Learning-based stereo depth methods produced more coherent maps than the traditional method at 1-2 m distances, though only qualitative checks were shown. The work claims this combination removes the need for auxiliary depth hardware while handling the sparse textures and noise typical of forest scenes.

Core claim

By pairing real-time segmentation masks from YOLOv8 or YOLOv9 with disparity maps from deep stereo networks such as RAFT-Stereo or ACVNet, then applying centroid extraction and median-absolute-deviation filtering to the resulting 3D points, the system yields a robust per-branch distance estimate that supports pruning operations on branches down to 10 mm thickness at typical drone working distances.

What carries the argument

Centroid-based triangulation with Median-Absolute-Deviation outlier rejection that converts a segmentation mask and disparity map into one reliable branch-to-camera distance.
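As a rough illustration of how a mask and a disparity map might be reduced to one distance, the sketch below applies z = f·b/d to every masked pixel, rejects outliers with a scaled MAD rule, and returns the median of what remains. It is a simplification and not the authors' implementation: the paper samples triangle centroids along the branch, whereas this sketch uses all mask pixels. The names `mad_filter` and `branch_distance`, the threshold `k = 3`, and the camera parameters are assumptions introduced here.

```python
import numpy as np


def mad_filter(values, k=3.0):
    """Keep values within k scaled MADs of the median (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    med = np.median(values)
    mad = np.median(np.abs(values - med))
    if mad == 0.0:
        # At least half of the samples equal the median; keep only those.
        return values[values == med]
    # 1.4826 makes the MAD comparable to a standard deviation for Gaussian data.
    keep = np.abs(values - med) <= k * 1.4826 * mad
    return values[keep]


def branch_distance(mask, disparity, focal_px, baseline_m, k=3.0):
    """Reduce a binary branch mask and a disparity map to one distance in metres.

    Assumes a rectified stereo pair, disparity in pixels, focal length in
    pixels, and baseline in metres. Returns None if no valid pixel exists.
    """
    valid = np.asarray(mask).astype(bool) & (disparity > 0)
    d = disparity[valid]
    if d.size == 0:
        return None
    depths = focal_px * baseline_m / d   # z = f * b / d for each masked pixel
    inliers = mad_filter(depths, k)      # drop background bleed and mismatches
    return float(np.median(inliers))
```

Taking the median of the inliers rather than the mean is a design choice made here so that a few surviving bad pixels have little effect on the final distance.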

If this is right

Autonomous pruning platforms can drop expensive auxiliary depth sensors and still target branches as thin as 10 mm.
The same segmentation-plus-centroid pipeline can be swapped to newer YOLO releases without redesigning the depth or positioning stages.
Forestry-specific fine-tuning of stereo networks is required because urban driving benchmarks leave a noticeable domain gap in natural scenes.
Real-time operation on drones becomes feasible once the chosen stereo and segmentation models run at camera frame rates.
Sparse-texture handling improves when learning-based disparity methods replace classical block-matching approaches.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

If the same pipeline were tested on moving branches or under wind, the outlier rejection step might need additional temporal filtering to stay reliable.
The approach could transfer to other thin linear structures such as power lines or vineyard wires once domain-specific data is collected.
Coupling the 3D branch positions directly to a pruning end-effector would close the loop from perception to action without separate mapping steps.
Quantitative error budgets tied to branch diameter would let future work set clear accuracy targets instead of relying on visual inspection.
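One way to start such an error budget is the standard first-order sensitivity of rectified stereo depth to disparity error: since z = f·b/d, a disparity error δd maps to a depth error of roughly z²/(f·b)·δd. The sketch below evaluates that expression at the distances discussed above. The focal length, baseline, and quarter-pixel disparity error are hypothetical placeholders, not values reported in the paper, and `depth_error_m` is a name introduced here.

```python
def depth_error_m(z_m, focal_px, baseline_m, disparity_err_px):
    """First-order depth error for rectified stereo.

    From z = f*b/d it follows that |dz| ~= z**2 / (f*b) * |dd|.
    """
    return (z_m ** 2) / (focal_px * baseline_m) * disparity_err_px


# Hypothetical camera parameters and matching error (assumptions, not paper values).
FOCAL_PX = 700.0
BASELINE_M = 0.063
DISPARITY_ERR_PX = 0.25

for z in (1.0, 1.5, 2.0):
    err_mm = 1000.0 * depth_error_m(z, FOCAL_PX, BASELINE_M, DISPARITY_ERR_PX)
    print(f"z = {z:.1f} m -> depth error ~ {err_mm:.1f} mm")
```

The point of the sketch is the quadratic growth with range; the absolute numbers depend entirely on the real calibration and on the sub-pixel accuracy the stereo matcher actually achieves on thin branches.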

Load-bearing premise

Qualitative visual comparisons of depth maps at 1-2 m distances on a 71-pair custom dataset are sufficient to establish that the positioning accuracy meets the requirements for autonomous pruning of 10 mm branches in real operational conditions.

What would settle it

A quantitative measurement campaign that records average 3D positioning error larger than half the branch diameter (5 mm) or more than 10 percent of range on thin branches at 1-2 m would show the method does not yet meet pruning needs.
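The criterion above could be checked mechanically once ground-truth distances exist. The following sketch compares estimated and reference branch-to-camera distances against both thresholds. It assumes scalar range measurements in metres rather than full 3D positions, and the function name, argument names, and returned keys are illustrative choices made here.

```python
import numpy as np


def fails_pruning_tolerance(estimated_m, ground_truth_m,
                            branch_diameter_m=0.010, range_fraction=0.10):
    """Apply the two thresholds from the text to paired distance measurements.

    Fails if the mean absolute error exceeds half the branch diameter, or if
    the mean relative error exceeds the given fraction of range.
    """
    est = np.asarray(estimated_m, dtype=float)
    gt = np.asarray(ground_truth_m, dtype=float)
    abs_err = np.abs(est - gt)
    mean_abs = float(abs_err.mean())
    mean_rel = float((abs_err / gt).mean())
    return {
        "mean_abs_error_m": mean_abs,
        "mean_rel_error": mean_rel,
        "fails": bool(mean_abs > branch_diameter_m / 2 or mean_rel > range_fraction),
    }
```

For example, `fails_pruning_tolerance([1.002, 1.498, 2.003], [1.0, 1.5, 2.0])` reports a pass, because the mean absolute error is below 5 mm.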

Figures

Figures reproduced from arXiv: 2605.08213 by Bing Xue, Mengjie Zhang, Richard Green, Sam Schofield, Yida Lin.

**Figure 1.** Triangulation using two cameras to obtain the depth map. The point $(u_l, v_l)$ is the projection of point $p(x, y, z)$ onto the left image plane, and $(u_r, v_r)$ is the projection of the same point onto the right image plane; $b$ is the baseline distance between the two cameras. Symbols are summarised in …

**Figure 2.** Triangulation geometry showing the focal length $f$. Given $f$, the baseline $b$, and the disparity of $p(x, y, z)$ between the left and right images, the depth $z$ of $p$ can be recovered. Symbols are summarised in …

**Figure 3.** Figure 3 summarises the overall pipeline. A rectified stereo pair from the ZED Mini camera is processed in two parallel branches: the left image is passed through an instance segmentation network (Mask R-CNN or a YOLO segmentor) to obtain a binary branch mask, and the same stereo pair is passed through a disparity estimator (SGBM with WLS filtering, or one of the deep stereo networks) to obtain a dense disparity ma…

**Figure 4.** Block Matching illustration. $L$ denotes the template window and $T$ the search scan line. The left image $E_l$ serves as the reference, and the corresponding pixel is located in $E_r$. The multiple-window variant considers $N$ shifted windows $T_k$: $C_{\mathrm{multi}}(x, y, d) = \sum_{k=1}^{N} \sum_{(i,j) \in T_k} C(i, j, d)$ (16). Iterative diffusion propagates costs across the neighbourhood over $N$ iterations with weights $w_n$: $C_{\mathrm{diff}}(x, y, d) = \sum^{N}$ …

**Figure 3.** PSMNet, ACVNet, GWCNet, MobileStereoNet, RAFT-Stereo, and NeRF

**Figure 5.** Centroid-based branch localisation: (a) predicted boundary points around the branch, (b) the closest points grouped into triangles, and (c) the resulting centroid sample locations. 3.5.0.2. Neighbourhood expansion. Each centroid is expanded by $m$ neighbouring sample points to reduce the variance of the depth estimate. The $j$-th expanded point around centroid $p'_i$ is denoted $q_{i,j} = (x''_{i,j}, y''_{i,j})$. D…

**Figure 6.** Figure 6 shows the SGBM with WLS pipeline applied to a representative stereo pair from the branch dataset. The raw SGBM output (e) is sharp on textured surfaces but exhibits the well-known failure modes of local stereo on thin branches: streaking artefacts, background bleed, and missing pixels along the silhouette. Adding WLS post-filtering (f) propagates valid disparities along strong image edges, visibly cleaning…

**Figure 7.** Depth maps generated by MiDaS and Depth Anything at branch–camera distances of 1 m, 1.5 m, and 2 m. Note the lack of metric scale change between rows. … on KITTI 2015, and pre-trained on Scene Flow. Each variant is then applied to the branch dataset. Panels: (a) original left image; (b) Scene Flow pre-trained; (c) KITTI 2012 pre-trained; (d) KITTI 2012, 100 epochs; (e) KITTI 2015 pre-trained; (f) KITTI 2015, 100 epochs …

**Figure 8.** PSMNet disparity on the same branch input under different pre-training and fine-tuning regimes. Two patterns emerge in …

**Figure 9.** Figure 9 compares the disparity maps produced by all six deep stereo networks on the same branch input. Panels: (a) PSMNet; (b) ACVNet; (c) GWCNet; (d) MobileStereoNet; (e) RAFT-Stereo; (f) NeRF-Stereo.

**Figure 10.** Branch depth detection results: (a) detected points overlaid on the depth map after centroid sampling and MAD filtering, and (b) the corresponding points on the RGB image. Panels: (a) SGBM depth 1 m; (b) SGBM depth 1.5 m; (c) SGBM depth 2 m; (d) NeRF-Stereo depth 1 m; (e) NeRF-Stereo depth 1.5 m; (f) NeRF-Stereo depth 2 m.

**Figure 11.** Depth estimation around the branch using SGBM (top row) and NeRF-Stereo (bottom row) at branch–camera distances of 1 m, 1.5 m, and 2 m. Panels: (a) SGBM histogram 1 m; (b) SGBM histogram 1.5 m; (c) SGBM histogram 2 m; (d) NeRF-Stereo histogram 1 m; (e) NeRF-Stereo histogram 1.5 m; (f) NeRF-Stereo histogram 2 m.

**Figure 12.** Histograms of branch-pixel depth values for SGBM (top row) and NeRF-Stereo (bottom row) at 1 m, 1.5 m, and 2 m.

Original abstract

Manual pruning of radiata pine, a species of major economic importance to New Zealand forestry, is hazardous, labour-intensive, and increasingly constrained by workforce shortages. Existing autonomous pruning platforms typically rely on expensive sensors such as LiDAR and are limited to thick branches, which restricts their wider adoption. This paper investigates whether a single low-cost stereo camera mounted on a drone can provide sufficiently accurate branch detection and three-dimensional positioning to support autonomous pruning of branches as thin as 10 mm, thereby removing the need for auxiliary depth sensors. The proposed pipeline comprises two stages: branch segmentation and depth estimation. For segmentation, Mask R-CNN variants and the YOLOv8 and YOLOv9 families are compared on a custom dataset of 71 stereo image pairs captured with a ZED Mini camera; YOLOv8 and YOLOv9 are selected as representative state-of-the-art real-time segmentors at the time of data collection, and the framework is designed to remain compatible with newer YOLO releases. For depth estimation, a traditional method (SGBM with WLS filtering) and deep-learning-based methods (PSMNet, ACVNet, GWCNet, MobileStereoNet, RAFT-Stereo, and NeRF-Supervised Deep Stereo) are evaluated, including cross-dataset fine-tuning experiments that expose the domain gap between urban driving benchmarks and natural forestry scenes. The main novelty of this work lies in coupling stereo segmentation with a centroid-based triangulation algorithm and Median-Absolute-Deviation outlier rejection that converts a segmentation mask and disparity map into a single robust branch-to-camera distance, addressing the challenges of sparse texture, thin structures, and noisy disparity values typical of forest scenes. Qualitative evaluations at distances of 1-2 m show that the learning-based stereo methods produce more coherent depth es...

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

This paper adapts stereo segmentation with centroid-MAD triangulation for thin forest branches but rests its accuracy claims on qualitative depth maps without measured errors. The authors collected 71 stereo pairs with a ZED Mini in a radiata pine setting and compared Mask R-CNN and YOLO variants for segmentation, alongside several stereo networks including RAFT-Stereo for depth. The main addition is running the segmentation mask through a centroid step, pulling disparity values, and applying median absolute deviation to reject outliers before triangulation. That produces one distance per branch and is a reasonable response to noisy disparities on thin, low-texture objects in natural scenes. The comparisons also usefully flag the domain gap between driving benchmarks and forestry data, with some fine-tuning results shown. The qualitative depth maps at 1-2 m look coherent for the learning-based methods.

The central limitation is the missing quantitative validation. No ground-truth distances are reported, no MAE or RMSE on the final scalar branch distances, and no tolerance analysis tied to the 10 mm branch diameter or to drone motion. The evaluations stay at visual inspection of disparity maps on a small static dataset. That leaves the claim that auxiliary sensors can be dropped unsupported, even if the pipeline itself is described clearly.

This is aimed at researchers building drone systems for forestry or similar outdoor agricultural tasks. A reader working on practical vision for natural environments could use the model comparisons and the proposed combination as a starting point. It deserves a serious referee because the problem is relevant and the method is a grounded adaptation of existing tools. I would send it to review with the expectation that the authors add error metrics and ground-truth checks before it could be considered complete.

Referee Report

1 major / 2 minor

Summary. The paper claims that a single low-cost stereo camera (ZED Mini) mounted on a drone can deliver sufficiently accurate branch detection and 3D positioning for thin (10 mm) radiata pine branches to enable autonomous pruning without auxiliary depth sensors such as LiDAR. The proposed pipeline uses instance segmentation (Mask R-CNN, YOLOv8, YOLOv9) on a custom 71-pair stereo dataset, followed by depth estimation (SGBM, PSMNet, RAFT-Stereo and others) and a novel centroid-based triangulation step with Median-Absolute-Deviation outlier rejection to convert masks and disparity maps into a single robust branch-to-camera distance; qualitative visual comparisons at 1-2 m distances are presented to support the accuracy claim.

Significance. If the mm-scale positioning accuracy were quantitatively validated, the work would be significant for low-cost automation in New Zealand forestry by reducing reliance on expensive sensors and extending pruning capability to thinner branches. The comparison of segmentation and stereo models on challenging forest scenes with sparse texture, plus the practical centroid-MAD post-processing, provides a useful baseline for future drone-based systems.

major comments (1)

[Abstract and Evaluation section] The central claim that the pipeline provides 'sufficiently accurate' 3D positioning for 10 mm branches (thereby removing the need for auxiliary sensors) is not supported by any quantitative evidence. Only qualitative visual comparisons of depth maps and segmentation masks at 1-2 m on the 71-pair custom dataset are reported; no ground-truth branch-to-camera distances, MAE/RMSE on the final triangulated scalar distances, error bars, tolerance analysis relative to branch radius, or drone-mounted repeatability tests appear. This directly undermines the strongest claim.

minor comments (2)

[Pipeline description (Section 3)] The description of the centroid-MAD triangulation and outlier rejection would be clearer with explicit equations or pseudocode showing how the segmentation mask and disparity map are reduced to a single distance value.
[Abstract] The abstract sentence on learning-based stereo methods is truncated ('more coherent depth es...'); complete it in the revision.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback. We address the major comment below, clarifying our evaluation choices while committing to revisions that temper claims and add context without misrepresenting the work.

Referee: [Abstract and Evaluation section] The central claim that the pipeline provides 'sufficiently accurate' 3D positioning for 10 mm branches (thereby removing the need for auxiliary sensors) is not supported by any quantitative evidence. Only qualitative visual comparisons of depth maps and segmentation masks at 1-2 m on the 71-pair custom dataset are reported; no ground-truth branch-to-camera distances, MAE/RMSE on the final triangulated scalar distances, error bars, tolerance analysis relative to branch radius, or drone-mounted repeatability tests appear. This directly undermines the strongest claim.

Authors: We agree that quantitative metrics such as MAE/RMSE on the final triangulated distances would provide stronger support for the accuracy claim. The manuscript focuses on qualitative visual comparisons because obtaining precise ground-truth 3D positions for thin 10 mm branches in unstructured forest scenes is practically challenging without auxiliary sensors that would undermine the low-cost premise. The centroid-MAD triangulation is presented as a practical post-processing step that yields coherent positions despite noisy disparities. We will revise the abstract and evaluation sections to explicitly qualify the results as qualitative, moderate the phrasing of 'sufficiently accurate,' and add a tolerance discussion relative to branch radius (e.g., positioning error acceptable if within 5 mm for pruning contact). This addresses the concern while preserving the contribution as a baseline for low-cost drone systems.

Revision: partial

Circularity Check

0 steps flagged

No circularity: experimental comparison of off-the-shelf models on new data

The manuscript evaluates standard segmentation networks (Mask R-CNN, YOLOv8/9) and stereo depth estimators (SGBM, PSMNet, RAFT-Stereo, etc.) on a newly collected 71-pair ZED Mini dataset. The final distance is obtained via a centroid-MAD triangulation step that applies a conventional statistical outlier rule to disparity values; this step is not fitted to the target distances and does not redefine any reported quantity in terms of itself. No equations, uniqueness theorems, or self-citations reduce the claimed positioning accuracy to a tautology or to parameters optimized on the same evaluation set. The work therefore remains an independent empirical comparison.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard computer-vision assumptions about stereo correspondence in natural scenes and the representativeness of a small custom dataset; no new physical entities or ad-hoc constants are introduced beyond typical hyperparameters in the cited networks.

axioms (2)

Domain assumption: Disparity maps from stereo matching can be converted to metric distances via known camera intrinsics and baseline.
Invoked in the centroid-based triangulation step for branch-to-camera distance.
Domain assumption: A 71-pair custom dataset captured with ZED Mini is representative of operational forestry conditions for thin branches.
Underlies all reported qualitative evaluations and model selection.
