
arxiv: 2606.20455 · v1 · pith:373F4LETnew · submitted 2026-06-18 · 💻 cs.CV

PCFootprint: A Large-Scale Dataset and Benchmark for Vectorized Building Footprint Extraction from Aerial LiDAR Point Clouds

Pith reviewed 2026-06-26 18:04 UTC · model grok-4.3

classification 💻 cs.CV
keywords building footprint extraction · LiDAR point clouds · vectorized footprints · aerial laser scanning · dataset · benchmark · remote sensing · urban modeling

The pith

PCFootprint supplies the first large-scale public set of 33,000 aligned tiles for extracting vectorized building footprints directly from airborne LiDAR point clouds.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces PCFootprint as the first large-scale public dataset for vectorized building footprint extraction from aerial laser scanning point clouds rather than optical images. It consists of 33,000 tiles of 128 by 128 meters drawn from Estonian data, each paired with systematically aligned vectorized footprints, plus a separate 3,000-tile cross-domain test set. The authors benchmark existing methods on this data and document persistent difficulties such as high intra-class variance, imbalance, and noise in complex urban and rural scenes. A sympathetic reader would care because optical imagery is limited by occlusions, perspective distortion, and missing elevation, whereas point clouds supply direct height information needed for accurate Level of Detail building models. If the dataset holds, it supplies a common testbed that can drive progress in building modeling, urban scene understanding, and geospatial analysis.

Core claim

We present PCFootprint, the first large-scale public dataset for footprint extraction from airborne laser scanning point clouds, comprising 33,000 tiles with systematically aligned vectorized footprints. Each tile spans 128 by 128 meters and covers diverse urban and rural landscapes; a 3,000-tile cross-domain test set supports evaluation of geographic generalization. Comprehensive benchmarks of mainstream methods reveal significant remaining challenges including high intra-class variance, data imbalance, and noise.

What carries the argument

The PCFootprint dataset of 33,000 128-by-128-meter tiles containing point clouds paired with systematically aligned vectorized footprints.
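
As a rough illustration of what a 128 m × 128 m tile means operationally, the sketch below bins points into square tiles by integer division of their planimetric coordinates. The function names, the grid origin, and the sample coordinates are assumptions made for illustration; the paper's actual tiling and sampling procedure is not described in the material summarized here.

```python
from collections import defaultdict
from math import floor

TILE_SIZE_M = 128.0  # tile edge length stated in the paper


def tile_index(x, y, origin=(0.0, 0.0), size=TILE_SIZE_M):
    """Return the (column, row) of the square tile containing point (x, y)."""
    return (floor((x - origin[0]) / size), floor((y - origin[1]) / size))


def group_points_by_tile(points, origin=(0.0, 0.0), size=TILE_SIZE_M):
    """Group (x, y, z) points by the tile they fall into."""
    tiles = defaultdict(list)
    for x, y, z in points:
        tiles[tile_index(x, y, origin, size)].append((x, y, z))
    return dict(tiles)


# Hypothetical points in a projected, metre-based coordinate system.
sample = [
    (10.0, 20.0, 3.5),
    (127.9, 0.0, 4.0),
    (128.0, 0.0, 9.1),
    (300.0, 130.0, 12.0),
]
tiles = group_points_by_tile(sample)
```

A point lying exactly on a tile boundary is assigned to the tile on its upper side, which is one possible convention among several.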

If this is right

  • Methods can now be trained and evaluated on explicit elevation data rather than on elevation inferred from imagery alone.
  • The cross-domain test set provides a concrete measure of geographic generalization for any new footprint extraction algorithm.
  • Benchmark results establish baseline performance levels that future point-cloud-specific models must surpass.
  • The dataset directly supports downstream tasks that require aligned 2D footprints together with 3D height information for Level of Detail modeling.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The dataset could be extended by adding per-building semantic labels or multi-temporal scans to support change detection.
  • Point-cloud methods developed on PCFootprint may transfer to other airborne laser scanning collections once alignment protocols are standardized.
  • Combining the footprints with image-based cues could produce hybrid systems that correct residual alignment errors.

Load-bearing premise

The vectorized footprints are accurately and systematically aligned to the point clouds and the chosen tiles plus cross-domain test set represent a sufficiently broad range of urban and rural conditions to serve as a general benchmark.

What would settle it

Manual verification on a random sample of tiles shows systematic misalignment between the supplied footprints and the underlying point clouds, or a method trained on the training tiles fails to generalize on the 3,000-tile cross-domain test set.
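
One way such a spot check could be scripted, offered as a minimal sketch and not as the authors' validation protocol: for a sampled tile, measure the fraction of elevated LiDAR returns that fall inside the supplied footprint polygons. The helper names, the 2 m height threshold, and the assumption that heights are already normalized to ground level are all hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices (not closed)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def elevated_inside_fraction(points, footprints, min_height=2.0):
    """Fraction of points with height >= min_height that lie inside any footprint.

    points: iterable of (x, y, height_above_ground); footprints: list of polygons.
    Returns None when the tile has no elevated points.
    """
    elevated = [(x, y) for x, y, h in points if h >= min_height]
    if not elevated:
        return None
    hits = sum(
        any(point_in_polygon(x, y, poly) for poly in footprints)
        for x, y in elevated
    )
    return hits / len(elevated)


# Hypothetical tile: one square footprint and four points.
square = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
pts = [(5.0, 5.0, 6.0), (2.0, 2.0, 5.0), (20.0, 20.0, 7.0), (3.0, 3.0, 0.1)]
fraction = elevated_inside_fraction(pts, [square])
```

A low fraction does not by itself prove misalignment, since trees and other tall objects also produce elevated returns; the number is better read as a screening signal for tiles that deserve manual inspection.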

Figures

Figures reproduced from arXiv: 2606.20455 by Haoyuan Shen, Kuihao Wang, Ruisheng Wang, Yujun Liu.

Figure 1: Geographic sampling strategy and illustrative visualization of the PCFootprint dataset. (a) Nationwide building density …
Figure 2: Comparison of point cloud densities (pts/m²) across different counties in Estonia. Each violin plot shows the distribution of point cloud density, revealing the spread and concentration of points over building areas within each region. The multi-modal nature of these distributions indicates significant variations in LiDAR scanning patterns.
Figure 3: Conceptual workflow of the building height estimation.
Figure 4: Statistical distribution of building heights across Es…
Figure 5: Regional distribution of building scales across Esto…
Figure 8: Proportional distribution and subset splitting of point …
Figure 9: Qualitative comparison of building footprint extraction results across various methodologies. Representative scenarios …
Original abstract

Building footprint extraction is a fundamental task in photogrammetry, remote sensing, and computer vision. Recent image-based methods have achieved remarkable progress in extracting vectorized footprints from high-resolution optical imagery. However, optical imagery is inherently susceptible to occlusions, perspective distortions, and residual relief displacement, yielding incomplete or misaligned footprint extraction. Furthermore, the lack of explicit elevation information limits its direct applicability to Level of Detail building modeling. In this paper, we present PCFootprint, the first large-scale public dataset for footprint extraction from airborne laser scanning point clouds. PCFootprint comprises 33,000 tiles derived from the Estonian Land and Spatial Development Board, covering diverse urban and rural landscapes. Each tile spans 128 m × 128 m with systematically aligned vectorized footprints aligned to point clouds. The dataset includes a 3,000-tile cross-domain test set for evaluating generalization across geographic regions. We establish comprehensive benchmarks by evaluating mainstream methods. Experimental results reveal significant challenges including high intra-class variance, data imbalance, and noise across complex geospatial environments. We believe PCFootprint will advance future research in building modeling, urban scene understanding, and geospatial analysis. The PCFootprint dataset is publicly available at https://huggingface.co/datasets/Haoyuan-Shen/PCFootprint.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces PCFootprint as the first large-scale public dataset for vectorized building footprint extraction from airborne laser scanning (ALS) point clouds. It comprises 33,000 tiles of 128 m × 128 m point clouds from the Estonian Land and Spatial Development Board, each with systematically aligned vectorized footprints, plus a 3,000-tile cross-domain test set intended to evaluate generalization across geographic regions. The work benchmarks mainstream methods on the dataset and reports challenges such as high intra-class variance, data imbalance, and noise in complex environments. The dataset is released publicly via Hugging Face.

Significance. If the alignment between point clouds and footprints is rigorously validated and the test set provides genuine geographic diversity beyond a single national source, this would constitute a valuable public benchmark resource. It directly addresses limitations of optical imagery (occlusions, lack of elevation) for Level-of-Detail building modeling and supplies the first large-scale ALS-specific resource with reproducible code and data access, enabling systematic progress in urban scene understanding and geospatial analysis.

major comments (2)
  1. [Abstract] Abstract: the manuscript asserts that the 3,000-tile cross-domain test set supports evaluation of 'generalization across geographic regions' and positions the dataset as a 'general benchmark,' yet all tiles originate from a single national provider (Estonian Land and Spatial Development Board). No spatial separation metrics, municipality-level metadata, or evidence of multi-country coverage are supplied, so the test set may capture only intra-national variation rather than the broader generalization claimed; this directly affects the central benchmark utility assertion.
  2. [Abstract] Abstract: the claim of 'systematically aligned vectorized footprints aligned to point clouds' is load-bearing for dataset usability, but the manuscript supplies no description of the alignment procedure, validation metrics (e.g., IoU thresholds, manual review statistics), or error rates. Without these, downstream benchmark results cannot be confidently interpreted as reflecting true point-cloud-to-footprint correspondence.
minor comments (1)
  1. [Abstract] Abstract: the phrasing 'systematically aligned vectorized footprints aligned to point clouds' is redundant; a single concise statement of alignment would improve clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive comments, which help clarify the scope and presentation of our work. We address each major comment point by point below and indicate where revisions will be made.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the manuscript asserts that the 3,000-tile cross-domain test set supports evaluation of 'generalization across geographic regions' and positions the dataset as a 'general benchmark,' yet all tiles originate from a single national provider (Estonian Land and Spatial Development Board). No spatial separation metrics, municipality-level metadata, or evidence of multi-country coverage are supplied, so the test set may capture only intra-national variation rather than the broader generalization claimed; this directly affects the central benchmark utility assertion.

    Authors: We agree that the current wording risks overstating the geographic scope. All tiles, including the cross-domain test set, originate from the Estonian Land and Spatial Development Board and therefore represent intra-national variation across diverse urban, suburban, and rural landscapes within Estonia. The term 'cross-domain' was intended to denote separation by geographic sub-regions rather than international boundaries. In the revised manuscript we will explicitly qualify the claim to 'generalization across diverse geographic sub-regions within Estonia' and will add a brief discussion of this limitation in the dataset description section. We will also include any available municipality-level or spatial-separation statistics that can be derived from the source data. revision: partial

  2. Referee: [Abstract] Abstract: the claim of 'systematically aligned vectorized footprints aligned to point clouds' is load-bearing for dataset usability, but the manuscript supplies no description of the alignment procedure, validation metrics (e.g., IoU thresholds, manual review statistics), or error rates. Without these, downstream benchmark results cannot be confidently interpreted as reflecting true point-cloud-to-footprint correspondence.

    Authors: We acknowledge that the alignment procedure and its validation are insufficiently documented. The revised manuscript will include a new subsection detailing the alignment workflow (including any automated registration steps and subsequent manual verification), the quantitative validation metrics employed (such as IoU thresholds and overlap statistics), the scale of manual review performed, and any observed error rates or residual misalignment statistics. This addition will allow readers to assess the reliability of the point-cloud-to-footprint correspondence used in the benchmarks. revision: yes
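
The spatial-separation statistics mentioned in the first response could take a form like the following sketch, which reports how far each test tile lies from its nearest training tile. The tile centers, the function names, and the use of Euclidean distance in a projected, metre-based coordinate system are assumptions; the source does not say which statistics the authors will report.

```python
from math import hypot
from statistics import median


def nearest_train_distances(train_centers, test_centers):
    """For each test tile center, distance (m) to the closest training tile center."""
    if not train_centers:
        raise ValueError("train_centers must not be empty")
    return [
        min(hypot(tx - x, ty - y) for x, y in train_centers)
        for tx, ty in test_centers
    ]


def separation_summary(train_centers, test_centers, tile_size=128.0):
    """Summarize train/test separation; tile_size is the 128 m tile edge."""
    d = nearest_train_distances(train_centers, test_centers)
    return {
        "min_m": min(d),
        "median_m": median(d),
        # Share of test tiles within one tile edge of a training tile.
        "share_adjacent": sum(1 for v in d if v <= tile_size) / len(d),
    }


# Hypothetical tile centers (metres).
train = [(0.0, 0.0), (128.0, 0.0)]
test = [(0.0, 128.0), (1000.0, 0.0), (5000.0, 0.0)]
summary = separation_summary(train, test)
```

A large share of test tiles adjacent to training tiles would suggest that the "cross-domain" split measures interpolation within a region more than generalization across regions.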

Circularity Check

0 steps flagged

No significant circularity: dataset release with no derivations

Full rationale

This is a data-release paper whose central claim is the creation and public availability of PCFootprint (33,000 tiles plus a 3,000-tile test set) with aligned vectorized footprints. No equations, fitted parameters, predictions, or derivation chains exist in the abstract or described content. Benchmarks simply evaluate existing mainstream methods on the released data. The representativeness of the Estonian-sourced test set is an empirical assumption about geographic coverage, not a circular reduction of any result to its own inputs. No self-citation load-bearing steps, ansatzes, or renamings are present. The paper is self-contained as a benchmark contribution.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a dataset release and benchmark paper; the central contribution is new data rather than any derivation, so no free parameters, axioms, or invented entities are involved.


