pith. machine review for the scientific record.

arXiv: 2604.27323 · v1 · submitted 2026-04-30 · 📡 eess.IV · cs.CV

Recognition: unknown

Representative Spectral Correlation Network for Multi-source Remote Sensing Image Classification

Authors on Pith: no claims yet

Pith reviewed 2026-05-07 09:01 UTC · model grok-4.3

classification: 📡 eess.IV · cs.CV
keywords: multi-source classification · hyperspectral imaging · band selection · feature fusion · remote sensing · land cover · SAR data · LiDAR data

The pith

The Representative Spectral Correlation Network selects task-relevant spectral bands from hyperspectral images under cross-source guidance and performs adaptive fusion for superior multi-source classification.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a framework for fusing hyperspectral images with SAR or LiDAR data for land-cover classification. High-dimensional hyperspectral data contain redundant spectral information, while data from different sensors have mismatched characteristics that complicate integration. The method uses cross-source information to select the most relevant bands and then refines their interaction through attention and contextual modules. A reader would care because this promises more accurate land-surface mapping from available satellite data without excessive computational demand.

Core claim

RSCNet incorporates a Key Band Selection Module that adaptively selects task-relevant spectral bands from the original hyperspectral image under cross-source guidance, alleviating redundancy and information loss. It also includes a Cross-source Adaptive Fusion Module that performs cross-source attention weighting and local-global contextual refinement. Experiments on three public benchmark datasets show this achieves superior performance compared to state-of-the-art methods with substantially lower computational complexity.

What carries the argument

The Key Band Selection Module (KBSM), which uses guidance from complementary sensor data to choose a compact subset of spectral bands that retain discriminative power aligned with semantic classes.
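The mechanism is easiest to see in code. Below is a minimal sketch of cross-source-guided band selection, not the authors' KBSM: the pooled-statistics scorer, the straight-through top-k gate, and every dimension here are assumptions for illustration only.

    import torch
    import torch.nn as nn

    class GuidedBandSelector(nn.Module):
        # Hypothetical stand-in for KBSM: score each HSI band under
        # cross-source guidance and keep the top-k bands end to end.
        def __init__(self, n_bands: int, guide_dim: int, k: int):
            super().__init__()
            self.k = k
            self.scorer = nn.Sequential(
                nn.Linear(n_bands + guide_dim, 128), nn.ReLU(),
                nn.Linear(128, n_bands),
            )

        def forward(self, hsi: torch.Tensor, guide: torch.Tensor):
            # hsi: (B, C, H, W) hyperspectral patch; guide: (B, D) pooled
            # SAR/LiDAR feature supplying the cross-source guidance.
            band_stats = hsi.mean(dim=(2, 3))                      # (B, C)
            scores = self.scorer(torch.cat([band_stats, guide], 1))
            soft = torch.sigmoid(scores)                           # soft gate
            topk = scores.topk(self.k, dim=1).indices
            hard = torch.zeros_like(soft).scatter_(1, topk, 1.0)
            # straight-through: hard mask forward, soft gradient backward
            mask = hard + soft - soft.detach()
            return hsi * mask[:, :, None, None], topk

Whatever the paper's exact gate, the design point is the same: gradients from the classification loss flow back through the soft scores, so the surviving band subset is learned jointly with the fusion rather than fixed in advance the way a PCA projection would be.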

If this is right

  • The learned band subset reduces spectral redundancy while preserving alignment with land-cover semantics.
  • Cross-source attention enhances feature interaction between heterogeneous data sources.
  • The network delivers higher classification accuracy on benchmark datasets.
  • Overall computational complexity is reduced compared to full-band or PCA-based approaches (a rough cost check follows this list).
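The last bullet is cheap to sanity-check, since a convolution's cost is linear in its input channels: selecting k of C bands shrinks the network stem by roughly C/k. The band counts below are illustrative, not taken from the paper.

    # Hypothetical stem-cost comparison: full-band HSI vs. a selected subset.
    def stem_macs(c_in: int, c_out: int = 64, k: int = 3, h: int = 7, w: int = 7) -> int:
        # multiply-accumulates of one k-by-k conv over an h-by-w patch
        return c_in * c_out * k * k * h * w

    full, selected = stem_macs(244), stem_macs(30)
    print(f"stem reduction: {full / selected:.1f}x")  # -> stem reduction: 8.1x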

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This band selection strategy could be tested on additional sensor pairs to check consistency without retraining.
  • It may reduce reliance on manual band engineering in hyperspectral analysis pipelines.
  • Extensions to tasks like semantic segmentation or change detection in multi-source data could be explored.

Load-bearing premise

The band subset selected with guidance from one sensor type will maintain highly discriminative structures that match semantic cues across different datasets and new sensor combinations.

What would settle it

Running the model on a held-out multi-source dataset and checking whether the selected bands capture known discriminative wavelengths. If they do not, and accuracy shows no gain over baseline fusion techniques, the core claim fails.
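Scripted, the settling test might look like the check below; the overlap and gain thresholds are arbitrary placeholders, and selected_idx would come from the released RSCNet code.

    def claim_survives(selected_idx, known_idx, acc_rscnet, acc_baseline,
                       min_overlap=0.5, min_gain=0.005):
        # The claim fails only if the chosen bands miss the known
        # discriminative wavelengths AND accuracy shows no real gain
        # over a baseline fusion model.
        overlap = len(set(selected_idx) & set(known_idx)) / max(len(known_idx), 1)
        gain = acc_rscnet - acc_baseline
        return overlap >= min_overlap or gain > min_gain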

Figures

Figures reproduced from arXiv: 2604.27323 by Chuanzheng Gong, Feng Gao, Junyan Lin, Junyu Dong, Qian Du.

Figure 1: Comparison between existing multi-source data classification models.
Figure 2: The framework of the proposed Representative Spectral Correlation Network (RSCNet). The framework takes HSI, its PCA-reduced counterpart, and SAR/LiDAR data as inputs, which are individually encoded to obtain specific feature representations. The spectral-reduced HSI and SAR/LiDAR features are first integrated through the Cross-source Adaptive Fusion Module (CAFM). Guided by the fused features, the Key Band…
Figure 4: Illustration of the Cross-source Adaptive Fusion Module (CAFM). It…
Figure 7: Effect of the number of RSCB on classification performance on three datasets.
Figure 8: Effect of the input patch size on classification performance on three datasets.
Figure 9: Visualized classification results of different methods for the Augsburg dataset. (a) FusAtNet. (b) S…
Figure 10: Visualized classification results of different methods for the Berlin dataset. (a) FusAtNet. (b) S…
Figure 11: Visualized classification results of different methods for the Houston2013 dataset. (a) FusAtNet. (b) S…
Figure 12: Comparison of feature separability using t-SNE visualization. (a)…
Figure 13: Feature fusion validation on three datasets.
Original abstract

Hyperspectral image (HSI) and SAR/LiDAR data offer complementary spectral and structural information for land-cover classification. However, their effective fusion remains challenging due to two major limitations: The spectral redundancy in high-dimensional HSI and the heterogeneous characteristics between multi-source data. To this end, we propose Representative Spectral Correlation Network (RSCNet), a novel multi-source image classification framework specifically designed to address the above challenges through spectral selection and adaptive interaction. The network incorporates two key components: (1) Key Band Selection Module (KBSM) that adaptively selects task-relevant spectral bands from the original HSI under cross-source guidance, thereby alleviating redundancy and mitigating information loss from conventional PCA-based spectral reduction. Moreover, the learned band subset exhibits highly discriminative spectral structures that align with discriminative semantic cues, promoting compact yet expressive representations. (2) Cross-source Adaptive Fusion Module (CAFM) that performs cross-source attention weighting and local-global contextual refinement to enhance cross-source feature interaction. Experiments on three public benchmark datasets demonstrate that our RSCNet achieves superior performance compared with state-of-the-art methods, while maintaining substantially lower computational complexity. Our codes are publicly available at https://github.com/oucailab/RSCNet.
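The abstract specifies what CAFM does (cross-source attention weighting plus local-global contextual refinement) but not how. A generic cross-attention fusion in that spirit might look like the sketch below; the depthwise "local" branch, head count, and residual wiring are our assumptions, not the paper's design.

    import torch
    import torch.nn as nn

    class CrossSourceFusion(nn.Module):
        # Hypothetical stand-in for CAFM: HSI tokens attend to SAR/LiDAR
        # tokens for global context; a depthwise conv adds local context.
        def __init__(self, dim: int, heads: int = 4):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.norm = nn.LayerNorm(dim)
            self.local = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)

        def forward(self, hsi_feat: torch.Tensor, aux_feat: torch.Tensor):
            # both inputs: (B, C, H, W) feature maps from separate encoders
            B, C, H, W = hsi_feat.shape
            q = hsi_feat.flatten(2).transpose(1, 2)   # (B, HW, C)
            kv = aux_feat.flatten(2).transpose(1, 2)
            ctx, _ = self.attn(q, kv, kv)             # cross-source attention
            ctx = self.norm(q + ctx)                  # global refinement
            fused = ctx.transpose(1, 2).reshape(B, C, H, W)
            return fused + self.local(fused)          # local refinement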

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces RSCNet, a multi-source remote sensing classification framework for fusing hyperspectral images (HSI) with SAR/LiDAR data. It addresses spectral redundancy in HSI and heterogeneous data characteristics via two modules: the Key Band Selection Module (KBSM), which adaptively selects task-relevant bands from HSI under cross-source guidance, and the Cross-source Adaptive Fusion Module (CAFM), which applies cross-source attention weighting plus local-global contextual refinement. Experiments on three public benchmarks report superior accuracy over state-of-the-art methods at substantially lower computational complexity, with code released publicly.

Significance. If the empirical gains are robust, the work offers a practical advance in multi-source fusion by replacing generic dimensionality reduction (e.g., PCA) with guided band selection that preserves semantic discriminability while lowering complexity. Public code is a clear strength that enables direct verification and extension.

major comments (2)
  1. [Experiments] Experiments section: the central claim of superior performance and lower complexity rests on benchmark tables, yet no quantitative ablation isolating the contribution of KBSM versus CAFM, no error bars across multiple runs, and no explicit training-protocol details (optimizer, learning-rate schedule, data splits) are provided. Without these, it is impossible to rule out that gains arise from post-hoc hyperparameter choices rather than the proposed architecture.
  2. [Key Band Selection Module] KBSM description: the assertion that the learned band subset 'exhibits highly discriminative spectral structures that align with semantic cues' is stated without supporting analysis (e.g., band-selection visualizations, correlation with ground-truth classes, or comparison to unsupervised selection baselines). This claim is load-bearing for the motivation that KBSM avoids information loss, yet remains unverified beyond end-to-end accuracy.
minor comments (2)
  1. [Method] Notation for the number of selected bands and attention heads should be introduced once with explicit symbols rather than described only in prose.
  2. [Figures] Figure captions for network diagrams should explicitly label the input sources (HSI, SAR/LiDAR) and the flow through KBSM and CAFM.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and the recommendation for major revision. We address each major comment point by point below, agreeing where additional evidence is needed and outlining the specific revisions we will implement.

Point-by-point responses
  1. Referee: [Experiments] Experiments section: the central claim of superior performance and lower complexity rests on benchmark tables, yet no quantitative ablation isolating the contribution of KBSM versus CAFM, no error bars across multiple runs, and no explicit training-protocol details (optimizer, learning-rate schedule, data splits) are provided. Without these, it is impossible to rule out that gains arise from post-hoc hyperparameter choices rather than the proposed architecture.

    Authors: We agree that these elements are required for rigorous validation and to isolate architectural contributions. In the revised manuscript we will add: (1) quantitative ablation tables measuring the individual and joint impact of KBSM and CAFM on accuracy and complexity; (2) mean and standard-deviation error bars computed over at least five independent runs with different random seeds (a sketch of this protocol follows these responses); and (3) complete training-protocol details (optimizer, learning-rate schedule, batch size, and exact train/validation/test splits) for each of the three benchmarks. Because the code is already public, these additions will enable direct reproduction and verification. revision: yes

  2. Referee: [Key Band Selection Module] KBSM description: the assertion that the learned band subset 'exhibits highly discriminative spectral structures that align with semantic cues' is stated without supporting analysis (e.g., band-selection visualizations, correlation with ground-truth classes, or comparison to unsupervised selection baselines). This claim is load-bearing for the motivation that KBSM avoids information loss, yet remains unverified beyond end-to-end accuracy.

    Authors: We acknowledge that the claim requires direct empirical support. In the revision we will add: band-selection visualizations for representative samples from each dataset, quantitative correlation analysis between selected bands and ground-truth class labels (a sketch of such a check follows these responses), and side-by-side comparisons against unsupervised baselines (e.g., PCA and other standard band-selection methods). These additions will substantiate that the KBSM selects semantically aligned bands while reducing redundancy. revision: yes
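Both promised analyses are easy to pin down in code. First, the seed protocol from response 1; train_and_eval is a hypothetical stand-in for the public training entry point, not a function the repository is known to expose.

    import random
    import numpy as np
    import torch

    def accuracy_over_seeds(train_and_eval, seeds=(0, 1, 2, 3, 4)):
        # One full train/test cycle per seed; report mean and sample std.
        accs = []
        for s in seeds:
            random.seed(s)
            np.random.seed(s)
            torch.manual_seed(s)
            accs.append(train_and_eval(seed=s))  # overall accuracy of run s
        accs = np.asarray(accs)
        return accs.mean(), accs.std(ddof=1)

Second, one way the band/label correlation analysis from response 2 could be run: rank every band by mutual information with the ground-truth labels and see where the KBSM-selected bands land. The use of scikit-learn here is an assumption.

    from sklearn.feature_selection import mutual_info_classif

    def selected_band_ranks(X, y, selected):
        # X: (n_pixels, n_bands) spectra; y: (n_pixels,) class labels;
        # selected: band indices produced by KBSM.
        mi = mutual_info_classif(X, y, random_state=0)
        order = mi.argsort()[::-1]            # bands, most informative first
        rank_of = {int(b): r for r, b in enumerate(order)}
        # Mostly low ranks would support the semantic-alignment claim;
        # uniformly scattered ranks would undercut it.
        return sorted(rank_of[int(b)] for b in selected)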

Circularity Check

0 steps flagged

No significant circularity; the empirical claims rest on public benchmarks.

Full rationale

The paper proposes RSCNet with KBSM (band selection under cross-source guidance) and CAFM (adaptive fusion) modules for HSI/SAR/LiDAR classification. Central claims rest on experimental results showing superior accuracy and lower complexity versus SOTA on three public datasets, with code released. No mathematical derivation chain, fitted-parameter predictions, or self-citation load-bearing steps appear in the abstract or described method. The architecture is presented as a novel design choice validated externally rather than derived from inputs by construction, making the argument self-contained.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The method assumes that (1) a subset of spectral bands selected under cross-source guidance will be more discriminative than PCA-reduced bands and (2) attention weighting plus local-global refinement will resolve heterogeneous feature spaces without introducing new artifacts. Both are domain assumptions rather than derived results.

free parameters (1)
  • Number of selected bands and attention heads
    Hyperparameters that control the size of the band subset and the fusion module; their values are chosen to optimize benchmark performance.
axioms (2)
  • domain assumption Cross-source guidance from SAR/LiDAR can reliably identify task-relevant HSI bands without losing critical semantic information.
    Invoked in the description of KBSM; no proof or external validation supplied in the abstract.
  • domain assumption Attention-based weighting plus local-global refinement is sufficient to align heterogeneous multi-source features.
    Core premise of CAFM.

pith-pipeline@v0.9.0 · 5522 in / 1441 out tokens · 27549 ms · 2026-05-07T09:01:49.455815+00:00 · methodology

discussion (0)

