arxiv: 2605.14651 · v1 · submitted 2026-05-14 · 💻 cs.CV

TERRA-CD: Multi-Temporal Framework for Multi-class and Semantic Change Detection

Omkar Oak , Rukmini Nazre , Rujuta Budke , Suraj Sawant

Pith reviewed 2026-05-15 05:29 UTC · model grok-4.3

keywords: change detection · remote sensing · Sentinel-2 · benchmark dataset · urban vegetation · semantic change · multi-temporal

The pith

TERRA-CD supplies 5221 Sentinel-2 image pairs across 232 cities with three annotation layers for land-cover and change detection.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces the TERRA-CD dataset to fill the shortage of large, multi-temporal benchmarks for tracking urban vegetation and land-cover shifts. It supplies paired Sentinel-2 images from 2019 and 2024 over 232 cities in the USA and Europe together with three annotation schemes: 4-class land-cover maps, 3-class vegetation-change masks, and 13-class semantic-change masks that record every possible transition. The authors run several deep-learning pipelines, including Siamese networks, STANet variants, Bi-SRNet, ChangeMask, post-classification comparison, and HRSCD, to demonstrate that the data supports both vegetation multi-class change detection and full semantic change detection.

Core claim

The central claim is that the new TERRA-CD benchmark dataset, built from 5221 Sentinel-2 image pairs and equipped with three complementary annotation schemes, enables systematic evaluation of deep-learning methods for vegetation multi-class change detection and semantic change detection.

What carries the argument

The TERRA-CD dataset itself, whose three annotation schemes (4-class land cover, 3-class vegetation change, 13-class semantic transitions) provide the training and test targets for the evaluated change-detection models.
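The class counts are consistent with one simple reading: 4 land-cover classes give 4 × 3 = 12 ordered transitions between distinct classes, plus one no-change class, for 13 in total. The sketch below illustrates that reading by deriving a transition mask from two land-cover maps. The integer class IDs, the label ordering, and the function names are assumptions made for illustration; the paper's actual encoding may differ.

```python
from itertools import permutations

NUM_LAND_COVER_CLASSES = 4  # from the paper: 4-class land-cover maps


def build_transition_table(num_classes=NUM_LAND_COVER_CLASSES):
    """Map each ordered (before, after) pair of distinct classes to a label.

    Labels start at 1; label 0 is reserved for "no change".
    The ordering of labels is an assumption, not the dataset's encoding.
    """
    pairs = permutations(range(num_classes), 2)
    return {pair: label for label, pair in enumerate(pairs, start=1)}


def semantic_change_mask(before, after, table):
    """Derive a per-pixel transition mask from two land-cover maps.

    `before` and `after` are equally sized lists of rows of class IDs.
    """
    return [
        [0 if b == a else table[(b, a)] for b, a in zip(row_b, row_a)]
        for row_b, row_a in zip(before, after)
    ]


table = build_transition_table()
num_semantic_classes = len(table) + 1  # 12 transitions + 1 no-change class

# Toy 2x2 maps with hypothetical class IDs 0..3.
before = [[0, 0], [1, 3]]
after = [[0, 1], [1, 2]]
mask = semantic_change_mask(before, after, table)
print(num_semantic_classes, mask)
```

Comparing two per-date land-cover maps pixel by pixel in this way is also the basic idea behind post-classification comparison, one of the baselines the paper lists.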

Load-bearing premise

The 232 selected cities and the 2019-2024 interval are assumed to be representative enough for general urban vegetation and semantic change detection tasks.

What would settle it

Markedly lower accuracy from the same models on a fresh collection of Sentinel-2 pairs, drawn either from cities outside the 232-city set or from a different five-year window, would falsify the claim of broad representativeness.
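A weaker proxy for this test can be run inside the existing data by holding out whole cities, so that no city contributes pairs to both training and evaluation. The sketch below shows one deterministic way to do that. The record layout (`city`, `pair_id`), the 20% holdout fraction, and the hashing scheme are illustrative assumptions and are not taken from the paper or its repository.

```python
import hashlib


def city_bucket(city, holdout_fraction=0.2):
    """Assign a city to 'train' or 'holdout' from a stable hash of its name."""
    digest = hashlib.sha256(city.encode("utf-8")).hexdigest()
    position = int(digest[:8], 16) / 0xFFFFFFFF  # value in [0, 1]
    return "holdout" if position < holdout_fraction else "train"


def split_by_city(pairs, holdout_fraction=0.2):
    """Split image-pair records so that each city lands in exactly one split."""
    splits = {"train": [], "holdout": []}
    for pair in pairs:
        splits[city_bucket(pair["city"], holdout_fraction)].append(pair)
    return splits


# Hypothetical records; only the city names appear in the paper's figures.
pairs = [
    {"city": "Cleveland", "pair_id": 0},
    {"city": "Cleveland", "pair_id": 1},
    {"city": "Charlotte", "pair_id": 2},
    {"city": "Dortmund", "pair_id": 3},
    {"city": "Dortmund", "pair_id": 4},
]
splits = split_by_city(pairs)
train_cities = {p["city"] for p in splits["train"]}
holdout_cities = {p["city"] for p in splits["holdout"]}
print(sorted(train_cities), sorted(holdout_cities))
```

Hashing the city name keeps the assignment stable across runs and machines. A within-dataset holdout like this cannot replace the out-of-set or different-window test described above; it only bounds how much models rely on having seen a city before.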

Figures

Figures reproduced from arXiv: 2605.14651 by Omkar Oak, Rujuta Budke, Rukmini Nazre, Suraj Sawant.

**Figure 2.** Dataset generation process of TERRA-CD on imagery collected by the Copernicus Sentinel-2 mission [13]. Sentinel-2 provides multispectral imagery with a resolution of 10 meters, ideal for vegetation monitoring and urban analysis. We use the Level-2A (L2A) product, which provides atmospherically corrected Surface Reflectance (SR) imagery ensuring more accurate SR values for reliable change detection. …

**Figure 3.** Outline of the LCM Mask generation process.

**Figure 5.** Our benchmarking results establish TERRA-CD as a dataset useful for change detection research. The performance metrics across all evaluated models fall within comparable ranges to those reported on other prominent change detection datasets such as HRSCD [3], SECOND [4], and Hi-UCD [11].

**Figure 4.** Model predictions for MCD masks of Cleveland, Ohio.

**Figure 5.** Model predictions for SCD masks: (a) Charlotte, North Carolina and (b) Dortmund, Germany.

Original abstract

Urban vegetation monitoring plays a vital role in understanding environmental changes, yet comprehensive datasets for this purpose remain limited. To address this gap, we present the Temporal Remote-sensing Repository for Analyzing Change Detection (TERRA-CD), a benchmark dataset comprising 5,221 Sentinel-2 image pairs from 2019 and 2024, covering 232 cities across the USA and Europe. The dataset features three distinct annotation schemes: 4-class land cover mapping masks, 3-class vegetation change masks, and 13-class semantic change masks capturing all possible land cover transitions. Using various deep learning approaches including Siamese networks, STANet variants, Bi-SRNet, ChangeMask, Post-Classification Comparison, and HRSCD strategies, we evaluated the dataset's effectiveness for both vegetation Multi-class Change Detection and Semantic Change Detection. The proposed dataset and methods are available at https://github.com/omkarsoak/TERRA-CD.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

TERRA-CD is a new multi-annotated Sentinel-2 dataset with 5,221 pairs across 232 cities. It fills a gap in change detection benchmarks, but the abstract reports neither label validation nor performance numbers.

The paper releases 2019-2024 image pairs from US and European cities with three annotation layers: 4-class land cover, 3-class vegetation change, and 13-class semantic transitions. This combination of scale, multi-city coverage, and layered labels is not in the prior work the abstract cites, and the authors evaluate several standard models including Siamese networks, STANet variants, Bi-SRNet, and post-classification methods. They also release the data and code on GitHub, which makes the dataset immediately usable as a benchmark for urban vegetation and semantic change tasks. That is the practical value here.

The main soft spot is the absence of any annotation quality checks. The 13-class masks in particular encode complex transitions, yet the description gives no inter-annotator agreement, expert-review statistics, or error rates, so label noise remains an open question that could affect downstream comparisons. The abstract also omits all quantitative results from the model tests, which leaves the effectiveness claim unshown for now. The city and time-period choices look reasonable for recent urban monitoring, but they are presented without much justification.

This work is aimed at remote sensing and computer vision researchers who need fresh benchmarks for multi-class change detection. It has enough substance as a data resource to deserve peer review, where the full annotation protocol and results can be examined.

Referee Report

2 major / 1 minor

Summary. The paper introduces the TERRA-CD benchmark dataset consisting of 5,221 Sentinel-2 image pairs from 2019 and 2024 across 232 cities in the USA and Europe. It defines three annotation schemes (4-class land cover masks, 3-class vegetation change masks, and 13-class semantic change masks capturing all land-cover transitions) and evaluates several off-the-shelf deep learning models (Siamese networks, STANet variants, Bi-SRNet, ChangeMask, Post-Classification Comparison, HRSCD) on vegetation multi-class change detection and semantic change detection tasks.

Significance. If the annotations are shown to be reliable, TERRA-CD would supply a large-scale, multi-label temporal benchmark that fills a documented gap in urban vegetation and semantic transition datasets; the multi-scheme design (pixel-level land cover, vegetation change, and full transition semantics) would enable direct comparison of single-task versus joint modeling approaches.

major comments (2)

[Abstract] The claim that the dataset's effectiveness was evaluated with the listed models is unsupported because no quantitative results (accuracy, F1, IoU, or confusion matrices) or error analysis are supplied, leaving the central benchmark claim without empirical grounding.

[Dataset construction] No inter-annotator agreement, expert review protocol, or label-consistency statistics are reported for the 13-class semantic change masks; without these, label noise in the transition classes could undermine all downstream model comparisons.
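For concreteness, the metrics named in the first comment can all be read off a single confusion matrix. The sketch below computes per-class IoU and F1 for a 3-class problem, matching the size of the vegetation-change scheme. The label values and the toy predictions are made up for illustration and are not results from the paper.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels per (true class, predicted class) cell."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def per_class_iou_f1(cm):
    """Return a list of (IoU, F1) tuples, one per class.

    IoU = TP / (TP + FP + FN); F1 = 2 TP / (2 TP + FP + FN).
    A class absent from both labels and predictions yields NaN.
    """
    n = len(cm)
    scores = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # row total minus diagonal
        fp = sum(cm[r][c] for r in range(n)) - tp  # column total minus diagonal
        iou_den = tp + fp + fn
        f1_den = 2 * tp + fp + fn
        iou = tp / iou_den if iou_den else float("nan")
        f1 = 2 * tp / f1_den if f1_den else float("nan")
        scores.append((iou, f1))
    return scores


# Flattened toy labels for six pixels; class IDs 0..2 are hypothetical.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
scores = per_class_iou_f1(cm)
mean_iou = sum(iou for iou, _ in scores) / len(scores)
print(cm, scores, mean_iou)
```

Reporting the per-class values alongside the mean matters for change detection because the no-change class usually dominates the pixel count and can mask poor performance on rare transitions.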

minor comments (1)

The GitHub link is given but the manuscript does not specify data formats, licensing terms, or exact train/validation/test splits used in the reported experiments.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment point by point below and will revise the paper to strengthen the empirical support and annotation details.

Referee: [Abstract] The claim that the dataset's effectiveness was evaluated with the listed models is unsupported because no quantitative results (accuracy, F1, IoU, or confusion matrices) or error analysis are supplied, leaving the central benchmark claim without empirical grounding.

Authors: We agree that the abstract's reference to evaluation requires explicit quantitative grounding. The manuscript describes the models evaluated but does not embed specific metrics or error analysis in the abstract itself. In the revised version we will update the abstract to summarize key results (e.g., F1 and IoU scores across the vegetation multi-class and semantic change tasks) and add a concise error-analysis paragraph or subsection that reports confusion-matrix insights and failure modes. revision: yes

Referee: [Dataset construction] No inter-annotator agreement, expert review protocol, or label-consistency statistics are reported for the 13-class semantic change masks; without these, label noise in the transition classes could undermine all downstream model comparisons.

Authors: We acknowledge that explicit reliability statistics are necessary for a benchmark dataset. The 13-class masks were produced by remote-sensing experts using a documented labeling guideline, yet inter-annotator agreement and consistency metrics were omitted from the original submission. We will expand the Dataset construction section to describe the expert review protocol in detail and report inter-annotator agreement (Cohen's kappa) computed on a held-out subset of the 13-class annotations. revision: yes
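Cohen's kappa, the statistic named in this response, corrects raw agreement between two annotators for the agreement expected by chance: κ = (p_o − p_e) / (1 − p_e). The sketch below is a minimal stdlib implementation over flattened pixel labels. The two label sequences are invented for illustration and say nothing about the actual annotations.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)

    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each annotator's class frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical class; kappa is undefined.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels from two annotators on eight pixels.
annotator_a = [0, 0, 1, 1, 2, 2, 0, 1]
annotator_b = [0, 0, 1, 2, 2, 2, 0, 0]
kappa = cohens_kappa(annotator_a, annotator_b)
print(round(kappa, 4))
```

With 13 classes and a dominant no-change class, raw percent agreement can look high even when annotators disagree on most transitions, which is why a chance-corrected statistic is the more informative number to report.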

Circularity Check

0 steps flagged

No circularity: dataset paper with independent contribution

The paper introduces the TERRA-CD benchmark dataset of Sentinel-2 image pairs with three annotation schemes and evaluates it using standard off-the-shelf models (Siamese networks, STANet, Bi-SRNet, etc.). No equations, parameter fits, or derivations are present that could reduce to inputs by construction. No self-citation chains or uniqueness theorems are invoked to support the central claim. The contribution is the data resource itself, which stands independently of any fitted results or renamed patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, mathematical axioms, or invented entities are introduced; the work relies on existing Sentinel-2 imagery, standard deep-learning architectures, and conventional remote-sensing annotation practices.

Reference graph

Works this paper leans on

27 extracted references · 27 canonical work pages

[1] Chen, H. & Shi, Z. A spatial-temporal attention-based method and a new dataset for remote sensing image change detection. Remote Sensing. 12 pp. 1662 (2020)

[2] Adriano, B., Yokoya, N., Xia, J., Miura, H., Liu, W. & Matsuoka, M. Learning from multimodal and multitemporal earth observation data for building damage mapping. ISPRS Journal Of Photogrammetry And Remote Sensing. 175 pp. 132-143 (2021)

[3] Daudt, R., Le Saux, B., Boulch, A. & Gousseau, Y. Multitask learning for large-scale semantic change detection. Computer Vision And Image Understanding. 187 pp. 102783 (2019)

[4] Yang, K., Xia, G., Liu, Z., Du, B., Yang, W., Pelillo, M. & Zhang, L. Asymmetric siamese networks for semantic change detection in aerial images. IEEE Transactions On Geoscience And Remote Sensing. 60 pp. 1-18 (2021)

[5] Zhu, Q., Guo, X., Li, Z. & Li, D. A review of multi-class change detection for satellite remote sensing imagery. Geo-spatial Information Science. 27, 1-15 (2024)

[6] Lyu, H., Lu, H. & Mou, L. Learning a Transferable Change Rule from a Recurrent Neural Network for Land Cover Change Detection. Remote Sensing. 8 pp. 506 (2016)

[7] Du, B., Ru, L., Wu, C. & Zhang, L. Unsupervised Deep Slow Feature Analysis for Change Detection in Multi-Temporal Remote Sensing Images. IEEE Transactions On Geoscience And Remote Sensing. 57 pp. 9976-9992 (2019)

[8] Wu, C., Zhang, L. & Zhang, L. A Scene Change Detection Framework for Multi-Temporal Very High Resolution Remote Sensing Images. Signal Processing. 124 pp. 184-197 (2016)

[9] Song, A., Choi, J., Han, Y. & Kim, Y. Change Detection in Hyperspectral Images Using Recurrent 3D Fully Convolutional Networks. Remote Sensing. 10 pp. 1827 (2018)

[10] Liu, S., Marinelli, D., Bruzzone, L. & Bovolo, F. A Review of Change Detection in Multitemporal Hyperspectral Images: Current Techniques, Applications, and Challenges. IEEE Geoscience And Remote Sensing Magazine. 7 pp. 140-158 (2019)

[11] Tian, S., Zhong, Y., Zheng, Z., Ma, A., Tan, X. & Zhang, L. Large-scale deep learning based binary and semantic change detection in ultra high resolution remote sensing imagery: From benchmark datasets to urban application. ISPRS Journal Of Photogrammetry And Remote Sensing. 193 pp. 164-186 (2022)

[12] Hussain, M., Chen, D., Cheng, A., Wei, H. & Stanley, D. Change detection from remotely sensed images: From pixel-based to object-based approaches. ISPRS Journal Of Photogrammetry And Remote Sensing. 80 pp. 91-106 (2013)

[13] ESA Copernicus Sentinel-2 Mission Documentation. Available online: https://documentation.dataspace.copernicus.eu/Data/SentinelMissions/Sentinel2.html/, Accessed 12 Dec 2025

[14] Rouse, J., Haas, R., Deering, D., Schell, J. & Harlan, J. Monitoring the Vernal Advancement and Retrogradation (Green Wave Effect) of Natural Vegetation. (1974), Available online: https://ntrs.nasa.gov/citations/19750020419, Accessed 14 Dec 2025

[15] Milczarek, M., Robak, A. & Gadawska, A. Sentinel Water Mask (SWM) - New index for water detection on Sentinel-2 Images. Proceedings Of The 7th Advanced Training Course On Land Remote Sensing. (2017)

[16] Liu, X. & Yang, C. A Kernel Spectral Angle Mapper Algorithm for Remote Sensing Image Classification. Proceedings Of The 2013 6th International Congress On Image And Signal Processing (CISP). 2 pp. 814-818 (2013)

[17] Congedo, L. Semi-Automatic Classification Plugin: A Python Tool for the Download and Processing of Remote Sensing Images in QGIS. Journal Of Open Source Software. 6, 3172 (2021)

[18] Cui, Y., Jian, M., Lin, T., Song, Y. & Belongie, S. Class-Balanced Loss Based on Effective Number of Samples. IEEE/CVF Conference On Computer Vision And Pattern Recognition (CVPR). pp. 9268-9277 (2019)

[19] Loshchilov, I. & Hutter, F. Decoupled Weight Decay Regularization. International Conference On Learning Representations (ICLR). (2019)

[20] Daudt, R., Le Saux, B. & Boulch, A. Fully convolutional siamese networks for change detection. IEEE International Conference On Image Processing (ICIP). pp. 4063-4067 (2018)

[21] Fang, S., Li, K., Shao, J. & Li, Z. SNUNet-CD: A densely connected Siamese network for change detection of VHR images. IEEE Geoscience And Remote Sensing Letters. 19 pp. 1-5 (2021)

[22] Ronneberger, O., Fischer, P. & Brox, T. U-Net: Convolutional networks for biomedical image segmentation. Medical Image Computing And Computer-Assisted Intervention (MICCAI). pp. 234-241 (2015)

[23] Chaurasia, A. & Culurciello, E. LinkNet: Exploiting encoder representations for efficient semantic segmentation. IEEE Visual Communications And Image Processing (VCIP). pp. 1-4 (2017)

[24] Chen, L., Zhu, Y., Papandreou, G., Schroff, F. & Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. European Conference On Computer Vision (ECCV). pp. 801-818 (2018)

[25] Zhao, H., Shi, J., Qi, X., Wang, X. & Jia, J. Pyramid Scene Parsing Network. IEEE Conference On Computer Vision And Pattern Recognition (CVPR). pp. 2881-2890 (2017)

[26] Ding, L., Guo, H., Liu, S., Mou, L., Zhang, J. & Bruzzone, L. Bi-temporal semantic reasoning for the semantic change detection in HR remote sensing images. IEEE Transactions On Geoscience And Remote Sensing. 60 pp. 1-14 (2022)

[27] Zheng, Z., Zhong, Y., Tian, S., Ma, A. & Zhang, L. ChangeMask: Deep multi-task encoder-transformer-decoder architecture for semantic change detection. ISPRS Journal Of Photogrammetry And Remote Sensing. 183 pp. 228-239 (2022)