Bridging Spatial And Frequency Views For Disaster Assessment: Benefits And Limitations
Pith reviewed 2026-06-27 02:17 UTC · model grok-4.3
The pith
Dual-domain models achieve higher accuracy than single-domain ones for classifying building damage in satellite imagery, while frequency-only approaches perform worst.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Dual-domain models that combine spatial and frequency representations achieve the highest test accuracy (0.4688) and the lowest loss, outperforming single-domain models on those two metrics. The spatial-only model, however, reaches the best macro F1-score (0.4254), which indicates more balanced per-class performance. Frequency-only models perform worst and exhibit overfitting. All models struggle with the Minor damage class owing to class imbalance and fine-grained visual ambiguity, but dual-domain fusion improves detection of severe damage levels.
What carries the argument
Dual-domain fusion strategy that processes both spatial and frequency representations of imagery through an EfficientNet-B0 backbone for multi-class damage classification.
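The reviewed abstract does not spell out how the frequency representation or the fusion is built, so the following is only a minimal sketch of one plausible setup: a centered log-magnitude Fourier spectrum stacked onto the RGB channels. The function names, the grayscale conversion, and the early-fusion choice are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def log_magnitude_spectrum(gray: np.ndarray) -> np.ndarray:
    """Return the centered log-magnitude spectrum of a 2-D grayscale image."""
    spectrum = np.fft.fftshift(np.fft.fft2(gray))  # move the DC term to the center
    return np.log1p(np.abs(spectrum))  # log1p compresses the large dynamic range


def dual_domain_input(rgb: np.ndarray) -> np.ndarray:
    """Stack RGB with one frequency channel: (H, W, 3) -> (H, W, 4).

    Hypothetical early fusion; the paper's fusion strategy may differ.
    """
    gray = rgb.mean(axis=2)  # simple channel average as a grayscale proxy
    freq = log_magnitude_spectrum(gray)
    return np.concatenate([rgb, freq[..., None]], axis=2)


rng = np.random.default_rng(0)
patch = rng.random((64, 64, 3))  # random stand-in for a building crop in [0, 1)
fused = dual_domain_input(patch)
```

A standard ImageNet-pretrained EfficientNet-B0 stem takes three input channels, so a four-channel stack like this would need an adapted first convolution, or the two domains would have to pass through separate branches that are merged later.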
If this is right
- Dual spatial configurations deliver the highest accuracy and lowest loss compared to single-domain baselines.
- Spatial-only models achieve superior balanced performance across classes via the best macro F1-score.
- Frequency-only inputs lead to the lowest performance and show overfitting, which suggests limited generalization.
- Dual-domain fusion improves detection of severe damage classes more than minor ones.
- Class imbalance and fine-grained visual ambiguity limit accuracy for subtle damage levels across every configuration.
Where Pith is reading between the lines
- Techniques for handling class imbalance could further boost the modest gains from dual-domain inputs.
- The approach might extend usefully to other remote sensing tasks where texture cues complement spatial structure.
- Additional experiments on varied disaster datasets would test whether the observed dual-domain benefits hold more broadly.
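One common imbalance technique of the kind the first point alludes to is inverse-frequency class weighting of the loss. The sketch below shows the usual "balanced" heuristic; the class counts are hypothetical placeholders, not the xBD label distribution, and nothing here is taken from the paper's training setup.

```python
def inverse_frequency_weights(counts):
    """Return weight_c = N / (K * n_c) for each class c.

    N is the total sample count, K the number of classes, n_c the class count.
    """
    total = sum(counts.values())
    num_classes = len(counts)
    return {label: total / (num_classes * n) for label, n in counts.items()}


# Hypothetical counts chosen only to illustrate the effect of imbalance.
counts = {"no-damage": 700, "minor": 100, "major": 120, "destroyed": 80}
weights = inverse_frequency_weights(counts)
```

With these weights every class contributes the same total weight to the loss, so rare classes such as Minor are no longer dominated by the majority class. Whether this would actually improve the reported numbers is an open question the page does not answer.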
Load-bearing premise
Performance differences arise solely from the choice of spatial, frequency, or dual inputs because every model uses the identical EfficientNet-B0 backbone and training settings.
What would settle it
Retraining all three model types on a balanced version of the dataset or with a different backbone and checking whether the dual-domain accuracy advantage and frequency-only overfitting disappear.
Original abstract
Rapid assessment of building damage from satellite imagery is essential for effective disaster response and recovery. While most deep learning methods rely on spatial-domain features, frequency-domain representations can capture complementary structural cues such as debris patterns and collapse-induced textures. This study presents a controlled comparison of spatial-domain, frequency-domain, and dual-domain deep learning approaches for multi-class building damage classification using post-disaster imagery from the xView2 (xBD) dataset. To ensure fairness, all models are built on an EfficientNet-B0 backbone and trained under identical settings, differing only in their input representations and fusion strategies. Performance is evaluated using accuracy, macro F1-score, per-class metrics, and confusion matrices. Results show that dual-domain models provide measurable improvements over single-domain approaches. The dual spatial configuration achieves the highest test accuracy (0.4688) and lowest loss, while the spatial-only model attains the best macro F1-score (0.4254), indicating more balanced class performance. In contrast, frequency-only models perform worst and exhibit overfitting, suggesting limited generalization. Despite these gains, all models struggle to detect subtle damage levels, particularly the Minor class, due to class imbalance and fine-grained visual ambiguity. While dual-domain approaches improve detection of severe damage, challenges remain. These findings highlight the benefits and limitations of hybrid representations and motivate future work on data balancing, advanced fusion, and regularization.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript conducts a controlled empirical comparison of spatial-domain, frequency-domain, and dual-domain models (all using an EfficientNet-B0 backbone) for multi-class building damage classification on the xBD dataset. It claims that dual-domain models yield measurable improvements over single-domain baselines, with the dual-spatial configuration attaining the highest test accuracy (0.4688) and lowest loss, the spatial-only model achieving the best macro F1-score (0.4254), and frequency-only models performing worst with signs of overfitting; all models struggle with the Minor damage class due to imbalance and visual ambiguity.
Significance. If the experimental controls are shown to isolate domain effects, the work provides a useful data point on the practical value of frequency representations for capturing structural cues in post-disaster imagery. The explicit reporting of per-class metrics, confusion matrices, and the observation that dual-domain helps severe damage but not subtle levels could inform future hybrid architectures, though the modest absolute performance levels and persistent class-imbalance issues limit immediate applicability.
major comments (2)
- [Abstract, paragraph 2] The statement that 'all models are built on an EfficientNet-B0 backbone and trained under identical settings, differing only in their input representations and fusion strategies' is load-bearing for the central claim of domain-driven improvements, yet the manuscript provides no description of per-domain normalization, augmentation policies, or learning-rate scaling to compensate for the radically different value ranges, sparsity, and noise statistics of frequency inputs (e.g., Fourier magnitude/phase) versus RGB spatial images. This leaves open the possibility that observed gaps (dual-spatial acc 0.4688 vs. frequency-only underperformance) arise from optimization mismatch rather than representational complementarity.
- [Results] Results section (implied by reported metrics): The abstract states concrete numbers (accuracy 0.4688, macro F1 0.4254) but supplies no error bars, statistical significance tests, or validation curves. Without these, it is impossible to determine whether the 'measurable improvements' of dual-domain over single-domain are reliable or within the variance expected from random initialization and class imbalance.
minor comments (1)
- [Abstract] The abstract mentions 'per-class metrics and confusion matrices' but does not indicate whether these are included in the main text or supplementary material; adding a table or figure reference would improve traceability.
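The first major comment turns on how differently scaled frequency inputs are compared with RGB images. As a rough illustration of that concern, the sketch below log-scales Fourier magnitudes and then standardizes each channel with statistics fitted on a training batch. The helper names and the random stand-in data are assumptions; the manuscript's actual preprocessing is not described.

```python
import numpy as np


def fit_channel_stats(batch: np.ndarray):
    """Per-channel mean and std over a (N, H, W, C) training batch."""
    mean = batch.mean(axis=(0, 1, 2))
    std = batch.std(axis=(0, 1, 2))
    return mean, std


def standardize(batch: np.ndarray, mean: np.ndarray, std: np.ndarray, eps: float = 1e-8):
    """Shift and scale each channel to roughly zero mean and unit variance."""
    return (batch - mean) / (std + eps)


rng = np.random.default_rng(0)
images = rng.random((8, 32, 32, 3))  # random stand-in for RGB crops in [0, 1)

# Log-scaled Fourier magnitude per image and channel (FFT over the spatial axes).
spectra = np.log1p(np.abs(np.fft.fft2(images, axes=(1, 2))))

mean, std = fit_channel_stats(spectra)  # fit on training data only
normalized = standardize(spectra, mean, std)
```

Even after log-scaling, the raw spectra here span a wider range than the [0, 1) images, which is the kind of mismatch that could affect optimization if both inputs shared one learning rate and no per-domain normalization.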
Simulated Author's Rebuttal
We thank the referee for the constructive comments that highlight important aspects of experimental rigor. We address each major comment below and indicate the planned revisions.
- Referee: [Abstract, paragraph 2] The statement that 'all models are built on an EfficientNet-B0 backbone and trained under identical settings, differing only in their input representations and fusion strategies' is load-bearing for the central claim of domain-driven improvements, yet the manuscript provides no description of per-domain normalization, augmentation policies, or learning-rate scaling to compensate for the radically different value ranges, sparsity, and noise statistics of frequency inputs (e.g., Fourier magnitude/phase) versus RGB spatial images. This leaves open the possibility that observed gaps (dual-spatial acc 0.4688 vs. frequency-only underperformance) arise from optimization mismatch rather than representational complementarity.
Authors: We agree that the manuscript lacks explicit details on domain-specific preprocessing and training adjustments, which weakens the isolation of representational effects. In the revised version we will expand the Methods section with a full description of normalization (e.g., log-scaling and per-channel standardization for Fourier magnitude/phase), augmentation policies applied consistently across domains, and any learning-rate or optimizer adjustments made to accommodate the different input statistics. These additions will allow readers to evaluate whether the reported gaps reflect domain complementarity. revision: yes
- Referee: [Results] Results section (implied by reported metrics): The abstract states concrete numbers (accuracy 0.4688, macro F1 0.4254) but supplies no error bars, statistical significance tests, or validation curves. Without these, it is impossible to determine whether the 'measurable improvements' of dual-domain over single-domain are reliable or within the variance expected from random initialization and class imbalance.
Authors: We concur that variability measures are needed to substantiate the claimed improvements. The revised manuscript will report means and standard deviations from at least three independent runs with different random seeds, include learning curves in the supplementary material, and add a statistical comparison (e.g., paired t-tests) between the dual-domain and single-domain configurations. These changes will address concerns about reliability under random initialization and class imbalance. revision: yes
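The analysis promised in the second response can be sketched in a few lines. The example below reports mean and sample standard deviation over seeds and computes a paired t statistic by hand; the accuracy values are invented placeholders for three seeds, not numbers from the paper, and the function name is hypothetical.

```python
import math
import statistics


def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for paired samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd_d == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    return mean_d / (sd_d / math.sqrt(n)), n - 1


# Hypothetical per-seed test accuracies, paired by random seed.
dual_acc = [0.47, 0.46, 0.48]
spatial_acc = [0.45, 0.46, 0.44]

dual_mean, dual_std = statistics.mean(dual_acc), statistics.stdev(dual_acc)
t_stat, dof = paired_t_statistic(dual_acc, spatial_acc)
```

With only three runs there are 2 degrees of freedom, and the two-sided 5% critical value of the t distribution is then about 4.30, so small accuracy gaps are unlikely to reach significance. More seeds would give the comparison more statistical power.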
Circularity Check
No circularity: purely empirical comparison with held-out test metrics
This is an empirical machine-learning study reporting accuracy, F1, and loss from models trained on the xBD dataset with held-out test evaluation. No derivations, equations, predictions, or uniqueness claims appear that could reduce to fitted parameters or self-citations by construction. The central claim (dual-domain gains) rests on experimental outcomes under stated identical training settings, not on any definitional loop or imported ansatz. The fairness assumption is an experimental design choice open to external verification, not a self-referential reduction.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: EfficientNet-B0 is a suitable backbone for multi-class building damage classification from satellite imagery.