Holo360D: A Large-Scale Real-World Dataset with Continuous Trajectories for Advancing Panoramic 3D Reconstruction and Beyond
Pith reviewed 2026-05-08 12:26 UTC · model grok-4.3
The pith
Holo360D supplies the first large-scale dataset of continuous panoramic sequences with aligned high-completeness depth maps.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Holo360D is the first large-scale real-world dataset that supplies continuous panoramic sequences paired with accurately aligned high-completeness depth maps, registered point clouds, meshes, and camera poses. Raw data are captured with a 3D laser scanner and 360 camera, refined through SLAM systems, and cleaned by a post-processing pipeline of geometry denoising, mesh hole filling, and region-specific remeshing. Fine-tuning 3D reconstruction models on the dataset yields superior training signals compared with prior discrete-location collections.
What carries the argument
The Holo360D dataset of continuous panoramic sequences with SLAM-aligned high-completeness depth maps produced by laser scanning and a tailored post-processing pipeline.
If this is right
- Panoramic feed-forward 3D reconstruction models gain stronger multi-view training signals from continuous trajectories.
- The dataset functions as a standardized benchmark for evaluating and advancing panoramic 3D reconstruction methods.
- Fine-tuned models exhibit improved handling of spherical distortions when trained on the aligned depth maps.
- Public release of the data and code supports further development of related panoramic vision applications.
Where Pith is reading between the lines
- The focus on trajectory continuity implies that similar capture and processing choices could improve datasets for other wide-field sensors.
- Better panoramic reconstruction models trained this way may directly aid indoor mapping and navigation systems that use consumer 360 cameras.
- The post-processing steps could be tested on other large-scale 3D capture projects to reduce artifacts in depth maps.
Load-bearing premise
The post-processing pipeline of geometry denoising, mesh hole filling, and region-specific remeshing combined with online and offline SLAM produces sufficiently accurate alignments and high-completeness depth maps without major artifacts or biases.
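The denoising step of this premise is testable in isolation. The paper does not specify its denoising algorithm, so the following is only a minimal numpy sketch of one common choice, statistical outlier removal, with `k` and `std_ratio` as assumed defaults rather than values from the paper:

```python
import numpy as np

def denoise_point_cloud(points, k=8, std_ratio=2.0):
    """Statistical outlier removal: drop points whose mean distance to
    their k nearest neighbours exceeds the dataset mean by more than
    std_ratio standard deviations. A generic stand-in for the paper's
    unspecified geometry-denoising step (k, std_ratio are assumptions)."""
    # Full pairwise distance matrix (fine for a sketch; use a KD-tree at scale).
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    d.sort(axis=1)
    mean_knn = d[:, 1:k + 1].mean(axis=1)  # column 0 is the self-distance
    thresh = mean_knn.mean() + std_ratio * mean_knn.std()
    return points[mean_knn <= thresh]

# A dense cluster plus one far-away outlier point.
rng = np.random.default_rng(0)
cloud = np.vstack([rng.normal(0.0, 0.01, (200, 3)), [[5.0, 5.0, 5.0]]])
clean = denoise_point_cloud(cloud)
print(len(cloud), "->", len(clean))  # 201 -> 200: the outlier is dropped
```

The premise's second half, that such filtering introduces no bias, is exactly what a before/after error analysis would need to check.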
What would settle it
If models fine-tuned on Holo360D show no accuracy improvement or even worse results on panoramic 3D reconstruction tasks than models trained on existing discrete panoramic datasets, the claim of superior training signals would be falsified.
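Such a head-to-head test reduces to comparing a standard error metric on held-out panoramic depth. A minimal sketch, assuming depth RMSE as the metric; the function name and toy values are illustrative, not from the paper:

```python
import numpy as np

def depth_rmse(pred, gt, valid=None):
    """Root-mean-square depth error over valid ground-truth pixels.
    The validity rule (zero depth means missing ground truth) is an
    assumed convention, not taken from the paper."""
    if valid is None:
        valid = gt > 0
    err = pred[valid] - gt[valid]
    return float(np.sqrt(np.mean(err ** 2)))

gt = np.array([[2.0, 3.0], [0.0, 5.0]])       # 0.0 marks a hole in the ground truth
model_a = np.array([[2.1, 2.9], [1.0, 5.2]])  # e.g. a model fine-tuned on Holo360D
model_b = np.array([[2.5, 3.6], [1.0, 4.2]])  # e.g. a model trained on discrete data
print(depth_rmse(model_a, gt) < depth_rmse(model_b, gt))  # True
```

If this inequality failed consistently across models and scenes, the superior-training-signal claim would be falsified.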
Original abstract
While feed-forward 3D reconstruction models have advanced rapidly, they still exhibit degraded performance on panoramas due to spherical distortions. Moreover, existing panoramic 3D datasets are predominantly collected with 360 cameras fixed at discrete locations, resulting in discontinuous trajectories. These limitations critically hinder the development of panoramic feed-forward 3D reconstruction, especially for the multi-view setting. In this paper, we present Holo360D, a comprehensive dataset containing 109,495 panoramas paired with registered point clouds, meshes, and aligned camera poses. To our knowledge, Holo360D is the first large-scale dataset that provides continuous panoramic sequences with accurately aligned high-completeness depth maps. The raw data are initially collected using a 3D laser scanner coupled with a 360 camera. Subsequently, the raw data are processed with both online and offline SLAM systems. Furthermore, to enhance the 3D data quality, a post-processing pipeline tailored for the 360 dataset is proposed, including geometry denoising, mesh hole filling, and region-specific remeshing. Finally, we establish a new benchmark by fine-tuning 3D reconstruction models on Holo360D, providing key insights into effective fine-tuning strategies. Our results demonstrate that Holo360D delivers superior training signals and provides a comprehensive benchmark for advancing panoramic 3D reconstruction models. Datasets and Code will be made publicly available.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents Holo360D, a dataset of 109,495 panoramas captured along continuous trajectories and paired with registered point clouds, meshes, and camera poses. Raw data are captured with a 3D laser scanner and 360° camera, then processed via online and offline SLAM followed by a post-processing pipeline (geometry denoising, mesh hole filling, region-specific remeshing). The authors fine-tune existing 3D reconstruction models on the dataset to create a benchmark and claim that Holo360D supplies superior training signals for panoramic feed-forward reconstruction.
Significance. A high-quality, large-scale continuous-trajectory panoramic dataset with aligned depth would address a clear gap in multi-view 3D reconstruction research, where existing datasets are limited to discrete viewpoints. Public release of data and code is a concrete strength that could enable reproducible progress on spherical-distortion handling.
major comments (2)
- [Abstract and §4] Abstract and §4 (Benchmark): the central claim that Holo360D 'delivers superior training signals' rests on fine-tuning results, yet no quantitative metrics (RMSE, completeness percentages, alignment error distributions, or controlled ablations against prior panoramic datasets) are reported. This absence directly undermines verification of the 'accurately aligned high-completeness depth maps' assertion.
- [§3.3] §3.3 (Post-processing pipeline): the description of geometry denoising, hole filling, and region-specific remeshing contains no before/after quantitative validation or error analysis. Given that the pipeline is load-bearing for the 'high-completeness' and 'artifact-free' properties, the lack of such evidence leaves the weakest assumption untested.
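One of the quantities asked for above, a completeness percentage, has a conventional definition: the fraction of pixels carrying a valid depth value. A minimal sketch, where the validity rule (finite, positive, within the sensor's stated 120 m range) is an assumption rather than the paper's definition:

```python
import numpy as np

def completeness(depth, max_range=120.0):
    """Fraction of pixels with a valid depth value. The validity rule
    (finite, positive, within max_range) is an assumed convention."""
    valid = np.isfinite(depth) & (depth > 0) & (depth <= max_range)
    return float(valid.mean())

# Before/after a hypothetical hole-filling pass: NaNs mark holes.
before = np.array([[1.5, np.nan], [np.nan, 4.0]])
after  = np.array([[1.5, 2.0], [3.0, 4.0]])
print(completeness(before), "->", completeness(after))  # 0.5 -> 1.0
```

Reporting this number before and after the post-processing pipeline would address both major comments at once.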
minor comments (2)
- [Abstract] The abstract states 'Datasets and Code will be made publicly available' without a specific URL or repository link; this should be added for reproducibility.
- [§2] Notation for 'continuous trajectories' versus 'discontinuous' baselines could be clarified with a short diagram or table comparing trajectory properties across datasets.
Simulated Author's Rebuttal
We thank the referee for the constructive comments and for recognizing the potential value of Holo360D in addressing gaps in panoramic 3D reconstruction. We address each major comment below and will revise the manuscript accordingly to provide stronger quantitative support.
Point-by-point responses
- Referee: [Abstract and §4] Abstract and §4 (Benchmark): the central claim that Holo360D 'delivers superior training signals' rests on fine-tuning results, yet no quantitative metrics (RMSE, completeness percentages, alignment error distributions, or controlled ablations against prior panoramic datasets) are reported. This absence directly undermines verification of the 'accurately aligned high-completeness depth maps' assertion.
  Authors: We agree that the current version of the manuscript does not include the requested quantitative metrics or ablations in §4. In the revised manuscript we will expand the benchmark section to report RMSE, completeness percentages, alignment error distributions, and controlled comparisons against prior panoramic datasets. These additions will directly substantiate the claim of superior training signals and allow verification of the alignment and completeness properties. Revision: yes.
- Referee: [§3.3] §3.3 (Post-processing pipeline): the description of geometry denoising, hole filling, and region-specific remeshing contains no before/after quantitative validation or error analysis. Given that the pipeline is load-bearing for the 'high-completeness' and 'artifact-free' properties, the lack of such evidence leaves the weakest assumption untested.
  Authors: We concur that quantitative before-and-after validation is required for the post-processing pipeline. We will augment §3.3 with error analyses and metrics quantifying the effects of geometry denoising, hole-filling success rates, and region-specific remeshing accuracy. These additions will provide concrete evidence supporting the high-completeness and artifact-free characteristics of the final data. Revision: yes.
Circularity Check
No significant circularity in derivation chain
Full rationale
The paper presents an empirical dataset construction effort using laser scanning, 360 cameras, online/offline SLAM, and a post-processing pipeline of denoising, hole filling, and remeshing. No equations, parameter fittings, or mathematical derivations are described that could reduce to self-defined inputs or fitted quantities by construction. Claims of being the 'first large-scale dataset' with continuous trajectories and high-completeness depth maps rest on the described data collection process rather than any self-referential logic, self-citation chains, or renamed known results. The contribution is data release and benchmarking, with no load-bearing steps that collapse into their own inputs.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Online and offline SLAM systems can produce accurate camera pose alignment between 360 images and laser-scanned point clouds in real-world environments.
- domain assumption The proposed post-processing steps (denoising, hole filling, region-specific remeshing) improve 3D data quality without introducing new errors that affect downstream model training.
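The first axiom can be audited directly: given a SLAM pose, laser points project into the panorama under the standard equirectangular model, and the projected range can be compared against the aligned depth map. A minimal sketch; the axis convention (x right, y down, z forward) and the world-to-camera pose direction are assumptions, not taken from the paper:

```python
import numpy as np

def project_equirect(points_world, R, t, width, height):
    """Project world-frame laser points into an equirectangular panorama
    given a camera pose (R, t mapping world -> camera). Returns pixel
    coordinates and per-point range, so the projected range can be
    compared against the depth map to audit SLAM alignment."""
    pc = (R @ points_world.T).T + t                 # world -> camera frame
    r = np.linalg.norm(pc, axis=1)
    lon = np.arctan2(pc[:, 0], pc[:, 2])            # azimuth in (-pi, pi]
    lat = np.arcsin(np.clip(pc[:, 1] / r, -1, 1))   # elevation
    u = (lon / (2 * np.pi) + 0.5) * width
    v = (lat / np.pi + 0.5) * height
    return u, v, r

# A point 2 m straight ahead should land at the image centre.
u, v, r = project_equirect(np.array([[0.0, 0.0, 2.0]]),
                           np.eye(3), np.zeros(3), 2048, 1024)
print(round(u[0]), round(v[0]), r[0])  # 1024 512 2.0
```

Aggregating |projected range - depth map value| over many points and frames would turn this axiom into a measurable alignment-error distribution.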
Reference graph
Works this paper leans on
- [1] Hao Ai, Zidong Cao, and Lin Wang. A survey of representation learning, optimization strategies, and applications for omnidirectional vision. International Journal of Computer Vision, pages 1–40, 2025.
- [2] Georgios Albanis, Nikolaos Zioulis, Petros Drakoulis, Vasileios Gkitsas, Vladimiros Sterzentsenko, Federico Alvarez, Dimitrios Zarpalas, and Petros Daras. Pano3D: A holistic benchmark and a solid baseline for 360deg depth estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3727–3737, 2021.
- [3] Manuel López Antequera, Pau Gargallo, Markus Hofinger, Samuel Rota Bulò, Yubin Kuang, and Peter Kontschieder. Mapillary planet-scale depth dataset. In European Conference on Computer Vision, pages 589–604. Springer, 2020.
- [4] Iro Armeni, Sasha Sax, Amir R. Zamir, and Silvio Savarese. Joint 2D-3D-semantic data for indoor scene understanding. arXiv preprint arXiv:1702.01105, 2017.
- [5] Dejan Azinović, Ricardo Martin-Brualla, Dan B. Goldman, Matthias Nießner, and Justus Thies. Neural RGB-D surface reconstruction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6290–6301, 2022.
- [6] Zidong Cao, Jinjing Zhu, Hao Ai, Lutao Jiang, Yuanhuiyi Lyu, and Hui Xiong. ST2360D: Spatial-to-temporal consistency for training-free 360 monocular depth estimation. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025.
- [7] Zidong Cao, Jinjing Zhu, Weiming Zhang, Hao Ai, Haotian Bai, Hengshuang Zhao, and Lin Wang. PanDA: Towards panoramic depth anything with unlabeled panoramas and Möbius spatial augmentation. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 982–992, 2025.
- [8] Angel Chang, Angela Dai, Thomas Funkhouser, Maciej Halber, Matthias Niessner, Manolis Savva, Shuran Song, Andy Zeng, and Yinda Zhang. Matterport3D: Learning from RGB-D data in indoor environments. arXiv preprint arXiv:1709.06158, 2017.
- [9] Zheng Chen, Yan-Pei Cao, Yuan-Chen Guo, Chen Wang, Ying Shan, and Song-Hai Zhang. PanoGRF: Generalizable spherical radiance fields for wide-baseline panoramas. Advances in Neural Information Processing Systems, 36:6961–6985, 2023.
- [10] Zheng Chen, Chenming Wu, Zhelun Shen, Chen Zhao, Weicai Ye, Haocheng Feng, Errui Ding, and Song-Hai Zhang. Splatter-360: Generalizable 360 Gaussian splatting for wide-baseline panoramic images. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 21590–21599, 2025.
- [11] Hainan Cui, Xiang Gao, Shuhan Shen, and Zhanyi Hu. HSfM: Hybrid structure-from-motion. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1212–1221, 2017.
- [12] Qi Feng, Hubert P. H. Shum, and Shigeo Morishima. 360 depth estimation in the wild - the Depth360 dataset and the SegFuse network. 2022 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), pages 664–673, 2022.
- [13] Richard Hartley and Andrew Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press.
- [14] Huajian Huang, Changkun Liu, Yipeng Zhu, Hui Cheng, Tristan Braud, and Sai-Kit Yeung. 360Loc: A dataset and benchmark for omnidirectional visual localization with cross-device queries. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 22314–22324, 2024.
- [15] Po-Han Huang, Kevin Matzen, Johannes Kopf, Narendra Ahuja, and Jia-Bin Huang. DeepMVS: Learning multi-view stereopsis. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2821–2830.
- [16] Dongki Jung, Jaehoon Choi, Yonghan Lee, and Dinesh Manocha. IM360: Textured mesh reconstruction for large-scale indoor mapping with 360 cameras. arXiv preprint arXiv:2502.12545, 2025.
- [17] Vincent Leroy, Yohann Cabon, and Jérôme Revaud. Grounding image matching in 3D with MASt3R. In European Conference on Computer Vision, pages 71–91. Springer, 2024.
- [18] Ming Li, Xueqian Jin, Xuejiao Hu, Jingzhao Dai, Sidan Du, and Yang Li. MODE: Multi-view omnidirectional depth estimation with 360 cameras. In European Conference on Computer Vision, pages 197–213. Springer, 2022.
- [19] Zhengqi Li and Noah Snavely. MegaDepth: Learning single-view depth prediction from internet photos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2041–2050, 2018.
- [20] Yiyi Liao, Jun Xie, and Andreas Geiger. KITTI-360: A novel dataset and benchmarks for urban scene understanding in 2D and 3D. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(3):3292–3310, 2022.
- [21] Lu Ling, Yichen Sheng, Zhi Tu, Wentian Zhao, Cheng Xin, Kun Wan, Lantao Yu, Qianyu Guo, Zixun Yu, Yawen Lu, et al. DL3DV-10K: A large-scale scene dataset for deep learning-based 3D vision. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 22160–22169, 2024.
- [22] Yuzheng Liu, Siyan Dong, Shuzhe Wang, Yingda Yin, Yanchao Yang, Qingnan Fan, and Baoquan Chen. SLAM3R: Real-time dense scene reconstruction from monocular RGB videos. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 16651–16662, 2025.
- [23] Xiaqing Pan, Nicholas Charron, Yongqian Yang, Scott Peters, Thomas Whelan, Chen Kong, Omkar Parkhi, Richard Newcombe, and Yuheng Carl Ren. Aria Digital Twin: A new benchmark dataset for egocentric 3D machine perception. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 20133–20143, 2023.
- [24] Jeremy Reizenstein, Roman Shapovalov, Philipp Henzler, Luca Sbordone, Patrick Labatut, and David Novotny. Common Objects in 3D: Large-scale learning and evaluation of real-life 3D category reconstruction. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 10901–10911, 2021.
- [25] Ao Wang, Lihao Liu, Hui Chen, Zijia Lin, Jungong Han, and Guiguang Ding. YOLOE: Real-time seeing anything. arXiv preprint arXiv:2503.07465, 2025.
- [26] Hengyi Wang and Lourdes Agapito. 3D reconstruction with spatial memory. arXiv preprint arXiv:2408.16061, 2024.
- [27] Jianyuan Wang, Christian Rupprecht, and David Novotny. PoseDiffusion: Solving pose estimation via diffusion-aided bundle adjustment. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 9773–9783.
- [28] Jianyuan Wang, Minghao Chen, Nikita Karaev, Andrea Vedaldi, Christian Rupprecht, and David Novotny. VGGT: Visual geometry grounded transformer. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 5294–5306, 2025.
- [29] Ning-Hsu Albert Wang and Yu-Lun Liu. Depth Anywhere: Enhancing 360 monocular depth estimation via perspective distillation and unlabeled data augmentation. Advances in Neural Information Processing Systems, 37:127739–127764, 2024.
- [30] Qianqian Wang, Yifei Zhang, Aleksander Holynski, Alexei A. Efros, and Angjoo Kanazawa. Continuous 3D perception model with persistent state. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 10510–10522, 2025.
- [31] Ruicheng Wang, Sicheng Xu, Cassie Dai, Jianfeng Xiang, Yu Deng, Xin Tong, and Jiaolong Yang. MoGe: Unlocking accurate monocular geometry estimation for open-domain images with optimal training supervision. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 5261–5271, 2025.
- [32] Shuzhe Wang, Vincent Leroy, Yohann Cabon, Boris Chidlovskii, and Jerome Revaud. DUSt3R: Geometric 3D vision made easy. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20697–20709, 2024.
- [33] Yifan Wang, Jianjun Zhou, Haoyi Zhu, Wenzheng Chang, Yang Zhou, Zizun Li, Junyi Chen, Jiangmiao Pang, Chunhua Shen, and Tong He. π³: Permutation-equivariant visual geometry learning. arXiv preprint arXiv:2507.13347, 2025.
- [34] Jianing Yang, Alexander Sax, Kevin J. Liang, Mikael Henaff, Hao Tang, Ang Cao, Joyce Chai, Franziska Meier, and Matt Feiszli. Fast3R: Towards 3D reconstruction of 1000+ images in one forward pass. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 21924–21935.
- [35] Mehdi Zayene, Jannik Endres, Albias Havolli, Charles Corbière, Salim Cherkaoui, Alexandre Kontouli, and Alexandre Alahi. Helvipad: A real-world dataset for omnidirectional stereo depth estimation. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 26975–26984, 2025.
- [36] Cheng Zhang, Haofei Xu, Qianyi Wu, Camilo Cruz Gambardella, Dinh Phung, and Jianfei Cai. PanSplat: 4K panorama synthesis with feed-forward Gaussian splatting. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 11437–11447, 2025.
- [37] Junyi Zhang, Charles Herrmann, Junhwa Hur, Varun Jampani, Trevor Darrell, Forrester Cole, Deqing Sun, and Ming-Hsuan Yang. MonST3R: A simple approach for estimating geometry in the presence of motion. arXiv preprint arXiv:2410.03825, 2024.
- [38] Shangzhan Zhang, Jianyuan Wang, Yinghao Xu, Nan Xue, Christian Rupprecht, Xiaowei Zhou, Yujun Shen, and Gordon Wetzstein. FLARE: Feed-forward geometry, appearance and camera estimation from uncalibrated sparse views. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 21936–21947, 2025.
- [39] Wang Zhao, Shaohui Liu, Hengkai Guo, Wenping Wang, and Yong-Jin Liu. ParticleSfM: Exploiting dense point trajectories for localizing moving cameras in the wild. In European Conference on Computer Vision, pages 523–542. Springer.
- [40] Jia Zheng, Junfei Zhang, Jing Li, Rui Tang, Shenghua Gao, and Zihan Zhou. Structured3D: A large photo-realistic dataset for structured 3D modeling. In Computer Vision - ECCV 2020: 16th European Conference, Glasgow, UK, August 23-28, 2020, Proceedings, Part IX 16, pages 519–535. Springer, 2020.
- [41] Nikolaos Zioulis, Antonis Karakottas, Dimitrios Zarpalas, and Petros Daras. OmniDepth: Dense depth estimation for indoors spherical panoramas. In Proceedings of the European Conference on Computer Vision (ECCV), pages 448–465, 2018.
Supplementary material excerpts
- [42] Equipment details: the data acquisition device integrates a LiDAR, RTK-GNSS, an IMU, three pinhole cameras, and a 360° camera. The LiDAR offers a 360° × 270° (horizontal × vertical) field of view with a sensing range of 0.05 m to 120 m; it captures point clouds at 320,000 points per second, achieving an absolute precision of 5 cm and a relative precision ...
- [43] Challenging scenes: as shown in Fig. 13, the dataset includes (a) low-texture and repetitive-texture scenes, (b) large, long-sequence scenes, and (c) low-light and overexposed scenes. These challenging environments provide a robust basis for thoroughly evaluating the performance of pano...
- [44] Experiment settings: during training, a dynamic batch size is adopted following π³. n panoramic images (n ∈ [3, 6]) are sampled from a randomly selected window of a sequence and each panorama is decomposed into eight perspective views, so each training batch contains 24–48 perspective images, with at most 48 images processed on each GPU ...
- [45] More results: as discussed in Sec. 4.3 of the main paper, all models show improved quantitative and qualitative performance after fine-tuning on the dataset. The qualitative performance of the fine-tuned π³ model is further assessed under diverse evaluation settings, including sparse-view and single-view panoramic reconstruction ...
- [46] As shown in Fig. 17, the fine-tuned π³ approach achieves higher geometric accuracy than other depth estimation methods (including PanDA [2]). The improvement is attributed to two factors: first, the dataset provides outdoor and long-range indoor scenes, enabling the fine-tuned π³ to generalize better to such environments; second, although ...
- [47] Limitations and future work: although the dataset surpasses existing ones in both scale and quality and significantly improves the performance of fine-tuned models, one limitation remains: reconstruction quality degrades in distant regions. As shown in Fig. 18, while the fine-tuned model performs well in near regions, the geometric accuracy ...