pith. machine review for the scientific record. sign in

arxiv: 2604.14540 · v1 · submitted 2026-04-16 · 💻 cs.CV

Recognition: unknown

WILD-SAM: Phase-Aware Expert Adaptation of SAM for Landslide Detection in Wrapped InSAR Interferograms

Authors on Pith no claims yet

Pith reviewed 2026-05-10 11:49 UTC · model grok-4.3

classification 💻 cs.CV
keywords landslide detectionwrapped InSARSAM adaptationphase-aware adaptermixture of expertswavelet enhancementremote sensing segmentation
0
0 comments X

The pith

WILD-SAM adapts SAM with phase-aware experts and wavelet prompts to detect landslides accurately from wrapped InSAR interferograms.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tries to establish that inserting a Phase-Aware Mixture-of-Experts Adapter into SAM's frozen encoder and adding a Wavelet-Guided Subband Enhancement strategy can align the spectral mismatch between natural images and wrapped phase data. This alignment preserves the high-frequency fringes needed to outline landslide boundaries despite phase ambiguity and coherence noise. If correct, the method would allow reliable detection of slow-moving landslides directly from wrapped interferograms, avoiding the errors and costs of phase unwrapping.

Core claim

WILD-SAM integrates a Phase-Aware Mixture-of-Experts Adapter into the frozen SAM encoder to adaptively align spectral distributions between natural images and interferometric phase data through dynamic routing across convolutional experts, while the Wavelet-Guided Subband Enhancement strategy uses discrete wavelet transforms to disentangle high-frequency subbands and inject refined phase textures as dense prompts, ensuring topological integrity along sharp landslide boundaries and delivering state-of-the-art target completeness and contour fidelity on the ISSLIDE and ISSLIDE+ benchmarks.

What carries the argument

Phase-Aware Mixture-of-Experts (PA-MoE) Adapter that routes across heterogeneous convolutional experts to aggregate multi-scale spectral-textural priors, combined with Wavelet-Guided Subband Enhancement (WGSE) that generates frequency-aware dense prompts from high-frequency subbands.

Load-bearing premise

The spectral domain shift between natural images and wrapped interferometric phase data can be bridged by the PA-MoE Adapter and WGSE strategy while preserving the high-frequency fringes needed for accurate boundary delineation.

What would settle it

A direct comparison on held-out wrapped InSAR datasets where WILD-SAM shows no improvement in boundary precision or completeness over unmodified SAM or conventional segmentation baselines would falsify the central claim.

Figures

Figures reproduced from arXiv: 2604.14540 by Bin Pan, Heping Li, Sajid Hussain, Yucheng Pan, Zhangle Liu.

Figure 1
Figure 1. Figure 1: Illustration of the domain gap between interferometric and natural data. [PITH_FULL_IMAGE:figures/full_fig_p001_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Overview of the proposed WILD-SAM architecture. The framework takes a three-channel phase representation as input and consists of a frozen ViT [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Details of our Convolutional Routing Experts (CORE) module. [PITH_FULL_IMAGE:figures/full_fig_p004_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Wavelet-Domain Feature Rectifier (WDFR) and SE Gate. The high [PITH_FULL_IMAGE:figures/full_fig_p005_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Qualitative comparison of segmentation results on the ISSLIDE [ [PITH_FULL_IMAGE:figures/full_fig_p007_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Qualitative comparison of segmentation results on the ISSLIDE+ [ [PITH_FULL_IMAGE:figures/full_fig_p007_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Quantitative comparison of cross-region generalization performance on the Hunza-InSAR [ [PITH_FULL_IMAGE:figures/full_fig_p008_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Cross-region qualitative evaluation on the external Hunza-InSAR [ [PITH_FULL_IMAGE:figures/full_fig_p009_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Qualitative ablation results on the ISSLIDE [ [PITH_FULL_IMAGE:figures/full_fig_p010_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Visualization of feature attention maps input to the Mask Decoder. [PITH_FULL_IMAGE:figures/full_fig_p011_10.png] view at source ↗
read the original abstract

Detecting slow-moving landslides directly from wrapped Interferometric Synthetic Aperture Radar (InSAR) interferograms is crucial for efficient geohazard monitoring, yet it remains fundamentally challenged by severe phase ambiguity and complex coherence noise. While the Segment Anything Model (SAM) offers a powerful foundation for segmentation, its direct transfer to wrapped phase data is hindered by a profound spectral domain shift, which suppresses the high-frequency fringes essential for boundary delineation. To bridge this gap, we propose WILD-SAM, a novel parameter-efficient fine-tuning framework specifically designed to adapt SAM for high-precision landslide detection on wrapped interferograms. Specifically, the architecture integrates a Phase-Aware Mixture-of-Experts (PA-MoE) Adapter into the frozen encoder to align spectral distributions and introduces a Wavelet-Guided Subband Enhancement (WGSE) strategy to generate frequency-aware dense prompts. The PA-MoE Adapter exploits a dynamic routing mechanism across heterogeneous convolutional experts to adaptively aggregate multi-scale spectral-textural priors, effectively aligning the distribution discrepancy between natural images and interferometric phase data. Meanwhile, the WGSE strategy leverages discrete wavelet transforms to explicitly disentangle high-frequency subbands and refine directional phase textures, injecting these structural cues as dense prompts to ensure topological integrity along sharp landslide boundaries. Extensive experiments on the ISSLIDE and ISSLIDE+ benchmarks demonstrate that WILD-SAM achieves state-of-the-art performance, significantly outperforming existing methods in both target completeness and contour fidelity.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes WILD-SAM, a parameter-efficient fine-tuning framework adapting the Segment Anything Model (SAM) for landslide detection directly from wrapped InSAR interferograms. It integrates a Phase-Aware Mixture-of-Experts (PA-MoE) Adapter into the frozen encoder for spectral alignment via dynamic routing across convolutional experts and introduces a Wavelet-Guided Subband Enhancement (WGSE) strategy that applies discrete wavelet transforms to disentangle and refine high-frequency subbands, injecting them as dense prompts. The paper claims this yields state-of-the-art performance on the ISSLIDE and ISSLIDE+ benchmarks, with gains in target completeness and contour fidelity over prior methods.

Significance. If the experimental claims hold with proper validation, the work could meaningfully advance operational geohazard monitoring by enabling segmentation on wrapped phase data without error-prone unwrapping steps. The targeted use of wavelet subband refinement and MoE-based domain adaptation for preserving fringe structures in noisy interferograms addresses a genuine spectral mismatch problem in remote-sensing foundation-model transfer. The parameter-efficient design is a practical strength for deployment on limited InSAR datasets.

major comments (2)
  1. [WGSE description and experimental validation] The central SOTA claim on contour fidelity rests on the WGSE strategy's ability to preserve high-frequency phase fringes, yet the manuscript supplies no frequency-domain diagnostics (e.g., power-spectrum ratios before/after WGSE, fringe-visibility scores, or edge-preservation metrics) comparing input interferograms to WGSE outputs. This leaves open whether boundary improvements arise from true high-frequency recovery or from the PA-MoE adapter and prompt engineering alone.
  2. [Abstract and §4 (Experiments)] The abstract asserts SOTA results on named benchmarks but supplies no quantitative numbers, error bars, ablation studies, or experimental protocol details. Without these, the strength of the performance claims cannot be evaluated.
minor comments (1)
  1. [Method section] Notation for the PA-MoE routing mechanism and wavelet subband indices could be clarified with explicit equations or a diagram to aid reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive review. We address each major comment below and outline the revisions we will make to improve the manuscript.

read point-by-point responses
  1. Referee: [WGSE description and experimental validation] The central SOTA claim on contour fidelity rests on the WGSE strategy's ability to preserve high-frequency phase fringes, yet the manuscript supplies no frequency-domain diagnostics (e.g., power-spectrum ratios before/after WGSE, fringe-visibility scores, or edge-preservation metrics) comparing input interferograms to WGSE outputs. This leaves open whether boundary improvements arise from true high-frequency recovery or from the PA-MoE adapter and prompt engineering alone.

    Authors: We appreciate this observation on the need for direct validation of WGSE's frequency-domain effects. The manuscript describes how WGSE uses discrete wavelet transforms to disentangle high-frequency subbands and inject refined phase textures as dense prompts. However, we acknowledge that explicit diagnostics such as power-spectrum ratios, fringe-visibility scores, or edge-preservation metrics comparing inputs to WGSE outputs are not currently provided. In the revised manuscript we will add these analyses, including before/after comparisons on sample interferograms, to demonstrate high-frequency recovery and to help isolate WGSE's contribution from that of the PA-MoE adapter and prompt engineering. revision: yes

  2. Referee: [Abstract and §4 (Experiments)] The abstract asserts SOTA results on named benchmarks but supplies no quantitative numbers, error bars, ablation studies, or experimental protocol details. Without these, the strength of the performance claims cannot be evaluated.

    Authors: We agree that the current abstract and experimental section do not supply the requested quantitative details. The manuscript states that WILD-SAM achieves state-of-the-art results on ISSLIDE and ISSLIDE+ but does not include specific metrics, error bars, or protocol information in the abstract, and §4 lacks comprehensive ablation tables and repeated-run statistics. We will revise the abstract to include key quantitative results (e.g., IoU and F1 improvements with error bars) and a brief protocol summary. We will also expand §4 with full ablation studies, experimental protocol details, and error bars from multiple runs to allow proper evaluation of the performance claims. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper introduces an architectural adaptation of SAM via the PA-MoE Adapter and WGSE strategy for spectral alignment and frequency-aware prompting on wrapped InSAR data. No equations, fitted parameters, or performance metrics are defined in terms of themselves or prior self-citations in the abstract or method description. The SOTA claims rest on benchmark experiments rather than any self-definitional reduction, fitted-input prediction, or load-bearing self-citation chain. The derivation chain is self-contained as an independent fine-tuning framework without tautological constructions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 2 invented entities

Assessment is limited to the abstract; no numerical free parameters or external benchmarks are described. The ledger records the two explicitly introduced components and the background transfer-learning assumption.

axioms (1)
  • domain assumption The Segment Anything Model provides a transferable foundation for segmentation that can be adapted to new domains via parameter-efficient modules.
    This premise underpins the decision to start from frozen SAM rather than training from scratch.
invented entities (2)
  • Phase-Aware Mixture-of-Experts (PA-MoE) Adapter no independent evidence
    purpose: Dynamically route and aggregate multi-scale spectral-textural features to align natural-image and interferometric-phase distributions.
    New module introduced inside the encoder; no independent evidence outside this work is supplied.
  • Wavelet-Guided Subband Enhancement (WGSE) strategy no independent evidence
    purpose: Disentangle high-frequency subbands via discrete wavelet transform and inject them as dense prompts to preserve boundary topology.
    New prompting strategy proposed in the paper; no external validation is given.

pith-pipeline@v0.9.0 · 5576 in / 1453 out tokens · 64303 ms · 2026-05-10T11:49:01.435330+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

71 extracted references · 5 canonical work pages · 3 internal anchors

  1. [1]

    Sar monitoring of progressive and seasonal ground deformation using the permanent scatterers technique,

    C. Colesanti, A. Ferretti, F. Novali, C. Prati, and F. Rocca, “Sar monitoring of progressive and seasonal ground deformation using the permanent scatterers technique,”IEEE Transactions on Geoscience and Remote Sensing, vol. 41, no. 7, pp. 1685–1701, 2003

  2. [2]

    A small-baseline approach for investigating deformations on full-resolution differential sar interferograms,

    R. Lanari, O. Mora, M. Manunta, J. Mallorqui, P. Berardino, and E. Sansosti, “A small-baseline approach for investigating deformations on full-resolution differential sar interferograms,”IEEE Transactions on Geoscience and Remote Sensing, vol. 42, no. 7, pp. 1377–1386, 2004

  3. [3]

    A new algorithm for processing interferometric data-stacks: Squeesar,

    A. Ferretti, A. Fumagalli, F. Novali, C. Prati, F. Rocca, and A. Rucci, “A new algorithm for processing interferometric data-stacks: Squeesar,” IEEE Transactions on Geoscience and Remote Sensing, vol. 49, no. 9, pp. 3460–3470, 2011

  4. [4]

    Automatic detection and update of landslide inventory before and after impoundments at the lianghekou reservoir using sentinel-1 insar,

    Y . Wang, J. Dong, L. Zhang, S. Deng, G. Zhang, M. Liao, and J. Gong, “Automatic detection and update of landslide inventory before and after impoundments at the lianghekou reservoir using sentinel-1 insar,”International Journal of Applied Earth Observation and Geoinformation, vol. 118, p. 103224, 2023. [Online]. Available: https://www.sciencedirect.com/s...

  5. [5]

    Landslide detection in long-term and low-coherence scenario using faster intermittent stacking insar method,

    H. Dai, L. Wu, Y . Liao, L. Chen, and Y . Yang, “Landslide detection in long-term and low-coherence scenario using faster intermittent stacking insar method,”IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 18, pp. 19 245–19 259, 2025

  6. [6]

    Remote sensing-based detection and analysis of slow-moving landslides in aba prefecture, southwest china,

    J. Ren, W. Yang, Z. Ma, W. Li, S. Zeng, H. Fu, Y . Wen, and J. He, “Remote sensing-based detection and analysis of slow-moving landslides in aba prefecture, southwest china,”Remote Sensing, vol. 17, no. 8,

  7. [7]

    Available: https://www.mdpi.com/2072-4292/17/8/1462

    [Online]. Available: https://www.mdpi.com/2072-4292/17/8/1462

  8. [8]

    An embedding swin transformer model for automatic slow-moving landslide detection based on insar products,

    X. Chen, C. Zhao, X. Liu, S. Zhang, J. Xi, and B. A. Khan, “An embedding swin transformer model for automatic slow-moving landslide detection based on insar products,”IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1–15, 2024

  9. [9]

    A new deep learning neural network model for the identification of insar anomalous deformation areas,

    T. Zhang, W. Zhang, D. Cao, Y . Yi, and X. Wu, “A new deep learning neural network model for the identification of insar anomalous deformation areas,”Remote Sensing, vol. 14, no. 11, 2022. [Online]. Available: https://www.mdpi.com/2072-4292/14/11/2690

  10. [10]

    Automatic identification of active landslides over wide areas from time-series insar measurements using faster rcnn,

    J. Cai, L. Zhang, J. Dong, J. Guo, Y . Wang, and M. Liao, “Automatic identification of active landslides over wide areas from time-series insar measurements using faster rcnn,”International Journal of Applied Earth Observation and Geoinformation, vol. 124, p. 103516, 2023. [Online]. Available: https://www.sciencedirect.com/science/article/pii/ S1569843223003400

  11. [11]

    Drs-unet: A deep semantic segmentation network for the recognition of active landslides from insar imagery in the three rivers region of the qinghai–tibet plateau,

    X. Chen, X. Yao, Z. Zhou, Y . Liu, C. Yao, and K. Ren, “Drs-unet: A deep semantic segmentation network for the recognition of active landslides from insar imagery in the three rivers region of the qinghai–tibet plateau,”Remote Sensing, vol. 14, no. 8, 2022. [Online]. Available: https://www.mdpi.com/2072-4292/14/8/1848

  12. [12]

    A lightweight context-aware adaptive fusion network for automatic identification of active landslides,

    X. Cai, C. Song, Z. Li, Y . Chen, B. Chen, J. Du, C. Yu, W. Zhu, and J. Peng, “A lightweight context-aware adaptive fusion network for automatic identification of active landslides,”International Journal of Applied Earth Observation and Geoinformation, vol. 144, p. 104882, 2025. [Online]. Available: https://www.sciencedirect.com/ science/article/pii/S1569...

  13. [13]

    Change detection of slow-moving landslide with multi- source sbas-insar and light-u2net,

    J. Cai, D. Ming, F. Liu, X. Ling, N. Liu, L. Zhang, L. Xu, Y . Li, and M. Zhu, “Change detection of slow-moving landslide with multi- source sbas-insar and light-u2net,”International Journal of Applied Earth Observation and Geoinformation, vol. 136, p. 104387, 2025. [Online]. Available: https://www.sciencedirect.com/science/article/pii/ S1569843225000342

  14. [14]

    Zero-shot detection for insar-based land displacement by the deformation-prompt-based sam method,

    Y . He, B. Chen, M. Motagh, Y . Zhu, S. Shao, J. Li, B. Zhang, and H. Kaufmann, “Zero-shot detection for insar-based land displacement by the deformation-prompt-based sam method,”International Journal of Applied Earth Observation and Geoinformation, vol. 136, p. 104407, 2025. [Online]. Available: https://www.sciencedirect.com/ science/article/pii/S1569843...

  15. [15]

    Segment anything,

    A. Kirillov, E. Mintun, N. Ravi, H. Mao, C. Rolland, L. Gustafson, T. Xiao, S. Whitehead, A. C. Berg, W.-Y . Lo, P. Doll ´ar, and R. B. Girshick, “Segment anything,”2023 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 3992–4003, 2023. [Online]. Available: https://api.semanticscholar.org/CorpusID:257952310

  16. [16]

    Isslide: A new insar dataset for slow sliding area detection with machine learning,

    A. Bralet, E. Trouv ´e, J. Chanussot, and A. M. Atto, “Isslide: A new insar dataset for slow sliding area detection with machine learning,” IEEE Geoscience and Remote Sensing Letters, vol. 21, pp. 1–5, 2024

  17. [17]

    Mb-net: A network for accurately identifying creeping landslides from wrapped interferograms,

    R. Zhang, W. Zhu, B. Fan, Q. He, J. Zhan, C. Wang, and B. Zhang, “Mb-net: A network for accurately identifying creeping landslides from wrapped interferograms,”International Journal of Applied Earth Observation and Geoinformation, vol. 135, p. 104300, 2024. [Online]. Available: https://www.sciencedirect.com/science/article/pii/ S1569843224006587

  18. [18]

    Ecsplain: Explainability-constrained classifier for pairing the detection and the lo- calization of moving areas from sar interferograms,

    A. Bralet, A. M. Atto, J. Chanussot, and E. Trouv ´e, “Ecsplain: Explainability-constrained classifier for pairing the detection and the lo- calization of moving areas from sar interferograms,”IEEE Transactions on Geoscience and Remote Sensing, vol. 63, pp. 1–18, 2025

  19. [19]

    Medical sam adapter: Adapting segment anything model for medical image segmentation,

    J. Wu, Z. Wang, M. Hong, W. Ji, H. Fu, Y . Xu, M. Xu, and Y . Jin, “Medical sam adapter: Adapting segment anything model for medical image segmentation,”Medical Image Analysis, vol. 102, p. 103547, 2025. [Online]. Available: https://www.sciencedirect.com/ science/article/pii/S1361841525000945

  20. [20]

    Samrs: Scaling-up remote sensing segmentation dataset with segment anything model,

    D. Wang, J. Zhang, B. Du, M. Xu, L. Liu, D. Tao, and L. Zhang, “Samrs: Scaling-up remote sensing segmentation dataset with segment anything model,” inAdvances in Neural Information Processing Systems, vol. 36, 2023, pp. 8815–8827

  21. [21]

    Segment anything meets point tracking,

    F. Raji ˇc, L. Ke, Y . Tai, C. Tang, M. Danelljan, and F. Yu, “Segment anything meets point tracking,” in2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2025, pp. 9302–9311

  22. [22]

    ASTRA: Let Arbitrary Subjects Transform in Video Editing

    F. Shen, W. Xu, R. Yan, D. Zhang, X. Shu, and J. Tang, “Imagedit: Let any subject transform,”arXiv preprint arXiv:2510.01186, 2025

  23. [23]

    IMAGHar- mony: Controllable image editing with consistent object quantity and layout,

    F. Shen, X. Du, Y . Gao, J. Yu, Y . Cao, X. Lei, and J. Tang, “Imaghar- mony: Controllable image editing with consistent object quantity and layout,”arXiv preprint arXiv:2506.01949, 2025

  24. [24]

    Imagpose: A unified conditional framework for pose-guided person generation,

    F. Shen and J. Tang, “Imagpose: A unified conditional framework for pose-guided person generation,”Advances in neural information processing systems, vol. 37, pp. 6246–6266, 2024

  25. [25]

    The segment anything model (sam) for remote sensing applications: From zero to one shot,

    L. P. Osco, Q. Wu, E. L. de Lemos, W. N. Gonc ¸alves, A. P. M. Ramos, J. Li, and J. Marcato, “The segment anything model (sam) for remote sensing applications: From zero to one shot,”International Journal of Applied Earth Observation and Geoinformation, vol. 124, p. 103540, 2023. [Online]. Available: https://www.sciencedirect.com/ science/article/pii/S156...

  26. [26]

    Parameter-efficient transfer learning for NLP,

    N. Houlsby, A. Giurgiu, S. Jastrzebski, B. Morrone, Q. De Laroussilhe, A. Gesmundo, M. Attariyan, and S. Gelly, “Parameter-efficient transfer learning for NLP,” inProceedings of the 36th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, K. Chaudhuri and R. Salakhutdinov, Eds., vol. 97. PMLR, 09–15 Jun 2019, pp. 2...

  27. [27]

    LoRA: Low-rank adaptation of large language models,

    E. J. Hu, Y . Shen, P. Wallis, Z. Allen-Zhu, Y . Li, S. Wang, L. Wang, and W. Chen, “LoRA: Low-rank adaptation of large language models,” in International Conference on Learning Representations, 2022. [Online]. Available: https://openreview.net/forum?id=nZeVKeeFYf9

  28. [28]

    Multi-lora fine-tuned segment anything model for urban man-made object extraction,

    X. Lu and Q. Weng, “Multi-lora fine-tuned segment anything model for urban man-made object extraction,”IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1–19, 2024

  29. [29]

    A multispectral remote sensing crop segmentation method based on seg- ment anything model using multistage adaptation fine-tuning,

    B. Song, H. Yang, Y . Wu, P. Zhang, B. Wang, and G. Han, “A multispectral remote sensing crop segmentation method based on seg- ment anything model using multistage adaptation fine-tuning,”IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1–18, 2024

  30. [30]

    Sam-adapter: Adapting segment anything in underperformed scenes,

    T. Chen, L. Zhu, C. Deng, R. Cao, Y . Wang, S. Zhang, Z. Li, L. Sun, Y . Zang, and P. Mao, “Sam-adapter: Adapting segment anything in underperformed scenes,” inProceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 3367–3375

  31. [31]

    Rs-sam: Integrating multi-scale information for enhanced remote sensing image segmentation,

    E. Zhang, J. Liu, A. Cao, Z. Sun, H. Zhang, H. Wang, L. Sun, and M. Song, “Rs-sam: Integrating multi-scale information for enhanced remote sensing image segmentation,” inComputer Vision – ACCV 2024, M. Cho, I. Laptev, D. Tran, A. Yao, and H. Zha, Eds. Singapore: Springer Nature Singapore, 2025, pp. 280–296

  32. [32]

    Scd-sam: Adapting segment anything model for semantic change detection in remote sensing imagery,

    L. Mei, Z. Ye, C. Xu, H. Wang, Y . Wang, C. Lei, W. Yang, and Y . Li, “Scd-sam: Adapting segment anything model for semantic change detection in remote sensing imagery,”IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1–13, 2024

  33. [33]

    A universal adapter in segmentation models for transferable landslide mapping,

    R. Wei, Y . Li, Y . Li, B. Zhang, J. Wang, C. Wu, S. Yao, and C. Ye, “A universal adapter in segmentation models for transferable landslide mapping,”ISPRS Journal of Photogrammetry and Remote Sensing, vol. 218, pp. 446–465, 2024. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0924271624004143 13

  34. [34]

    Classwise-sam-adapter: Parameter-efficient fine-tuning adapts segment anything to sar domain for semantic segmentation,

    X. Pu, H. Jia, L. Zheng, F. Wang, and F. Xu, “Classwise-sam-adapter: Parameter-efficient fine-tuning adapts segment anything to sar domain for semantic segmentation,”IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 18, pp. 4791–4804, 2025

  35. [35]

    A vision foundation model-based method for large-scale forest disturbance mapping using time series sentinel-1 sar data,

    Y . Tian, F. Zhao, R. Meng, R. Sun, Y . Zhang, Y . Shen, B. Wang, J. Liu, and M. Li, “A vision foundation model-based method for large-scale forest disturbance mapping using time series sentinel-1 sar data,”Remote Sensing of Environment, vol. 325, p. 114775, 2025. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0034425725001798

  36. [36]

    Mesam: Multiscale enhanced segment anything model for optical remote sensing images,

    X. Zhou, F. Liang, L. Chen, H. Liu, Q. Song, G. Vivone, and J. Chanus- sot, “Mesam: Multiscale enhanced segment anything model for optical remote sensing images,”IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1–15, 2024

  37. [37]

    Bsdsnet: Dual-stream feature extraction network based on segment anything model for synthetic aperture radar land cover classification,

    Y . Wang, W. Zhang, W. Chen, and C. Chen, “Bsdsnet: Dual-stream feature extraction network based on segment anything model for synthetic aperture radar land cover classification,”Remote Sensing, vol. 16, no. 7, 2024. [Online]. Available: https://www.mdpi.com/ 2072-4292/16/7/1150

  38. [38]

    Sam-cffnet: Sam-based cross-feature fusion network for intelligent identification of landslides,

    L. Xi, J. Yu, D. Ge, Y . Pang, P. Zhou, C. Hou, Y . Li, Y . Chen, and Y . Dong, “Sam-cffnet: Sam-based cross-feature fusion network for intelligent identification of landslides,”Remote Sensing, vol. 16, no. 13, 2024. [Online]. Available: https://www.mdpi.com/2072-4292/ 16/13/2334

  39. [39]

    Adaptive mixtures of local experts,

    R. A. Jacobs, M. I. Jordan, S. J. Nowlan, and G. E. Hinton, “Adaptive mixtures of local experts,”Neural Computation, vol. 3, no. 1, pp. 79–87, 1991

  40. [40]

    Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer

    N. Shazeer, A. Mirhoseini, K. Maziarz, A. Davis, Q. V . Le, G. E. Hinton, and J. Dean, “Outrageously large neural networks: The sparsely-gated mixture-of-experts layer,”ArXiv, vol. abs/1701.06538, 2017. [Online]. Available: https://api.semanticscholar.org/CorpusID:12462234

  41. [41]

    Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity,

    W. Fedus, B. Zoph, and N. Shazeer, “Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity,”Journal of Machine Learning Research, vol. 23, no. 120, pp. 1–39, 2022

  42. [42]

    Mixture-of-experts with expert choice routing,

    Y . Zhou, T. Lei, H. Liu, N. Du, Y . Huang, V . Zhao, A. M. Dai, z. Chen, Q. V . Le, and J. Laudon, “Mixture-of-experts with expert choice routing,” inAdvances in Neural Information Processing Systems, S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh, Eds., vol. 35. Curran Associates, Inc., 2022, pp. 7103–7114. [Online]. Available: https:...

  43. [43]

    Tutel: Adaptive mixture-of-experts at scale,

    C. Hwang, W. Cui, Y . Xiong, Z. Yang, Z. Liu, H. Hu, Z. Wang, R. Salas, J. Jose, P. Ramet al., “Tutel: Adaptive mixture-of-experts at scale,” Proceedings of Machine Learning and Systems, vol. 5, pp. 269–287, 2023

  44. [44]

    Scaling vision with sparse mixture of experts,

    C. Riquelme, J. Puigcerver, B. Mustafa, M. Neumann, R. Jenatton, A. Susano Pinto, D. Keysers, and N. Houlsby, “Scaling vision with sparse mixture of experts,”Advances in Neural Information Processing Systems, vol. 34, pp. 8583–8595, 2021

  45. [45]

    Patch-level routing in mixture-of-experts is provably sample-efficient for convolutional neural networks,

    M. N. R. Chowdhury, S. Zhang, M. Wang, S. Liu, and P.-Y . Chen, “Patch-level routing in mixture-of-experts is provably sample-efficient for convolutional neural networks,” inInternational Conference on Machine Learning. PMLR, 2023, pp. 6074–6114

  46. [46]

    From sparse to soft mixtures of experts,

    J. Puigcerver, C. R. Ruiz, B. Mustafa, and N. Houlsby, “From sparse to soft mixtures of experts,” inThe Twelfth International Conference on Learning Representations, 2024. [Online]. Available: https://openreview.net/forum?id=jxpsAj7ltE

  47. [47]

    DeepSeekMoE: Towards ultimate expert specialization in mixture-of-experts language models,

    D. Dai, C. Deng, C. Zhao, R. Xu, H. Gao, D. Chen, J. Li, W. Zeng, X. Yu, Y . Wu, Z. Xie, Y . Li, P. Huang, F. Luo, C. Ruan, Z. Sui, and W. Liang, “DeepSeekMoE: Towards ultimate expert specialization in mixture-of-experts language models,” in Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), L....

  48. [48]

    Sparse upcycling: Training mixture-of-experts from dense checkpoints,

    A. Komatsuzaki, J. Puigcerver, J. Lee-Thorp, C. R. Ruiz, B. Mustafa, J. Ainslie, Y . Tay, M. Dehghani, and N. Houlsby, “Sparse upcycling: Training mixture-of-experts from dense checkpoints,” inThe Eleventh International Conference on Learning Representations, 2023. [Online]. Available: https://openreview.net/forum?id=T5nUQDrM4u

  49. [49]

    LoRAMoE: Alleviating world knowledge forgetting in large language models via MoE-style plugin,

    S. Dou, E. Zhou, Y . Liu, S. Gao, W. Shen, L. Xiong, Y . Zhou, X. Wang, Z. Xi, X. Fan, S. Pu, J. Zhu, R. Zheng, T. Gui, Q. Zhang, and X. Huang, “LoRAMoE: Alleviating world knowledge forgetting in large language models via MoE-style plugin,” in Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ...

  50. [50]

    Dselect-k: Differentiable selection in the mixture of experts with applications to multi-task learning,

    H. Hazimeh, Z. Zhao, A. Chowdhery, M. Sathiamoorthy, Y . Chen, R. Mazumder, L. Hong, and E. Chi, “Dselect-k: Differentiable selection in the mixture of experts with applications to multi-task learning,”Ad- vances in Neural Information Processing Systems, vol. 34, pp. 29 335– 29 347, 2021

  51. [51]

    Robust mixture-of-expert training for convolu- tional neural networks,

    Y . Zhang, R. Cai, T. Chen, G. Zhang, H. Zhang, P.-Y . Chen, S. Chang, Z. Wang, and S. Liu, “Robust mixture-of-expert training for convolu- tional neural networks,” inProceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 90–101

  52. [52]

    Boost- ing consistency in story visualization with rich-contextual conditional diffusion models,

    F. Shen, H. Ye, S. Liu, J. Zhang, C. Wang, X. Han, and Y . Wei, “Boost- ing consistency in story visualization with rich-contextual conditional diffusion models,” inProceedings of the AAAI Conference on Artificial Intelligence, vol. 39, no. 7, 2025, pp. 6785–6794

  53. [53]

    Transformer tracking via frequency fusion,

    X. Hu, B. Zhong, Q. Liang, S. Zhang, N. Li, X. Li, and R. Ji, “Transformer tracking via frequency fusion,”IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 2, pp. 1020– 1031, 2024

  54. [54]

    Towards building more robust models with frequency bias,

    Q. Bu, D. Huang, and H. Cui, “Towards building more robust models with frequency bias,” in2023 IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 4379–4388

  55. [55]

    Wave-vit: Unifying wavelet and transformers for visual representation learning,

    T. Yao, Y . Pan, Y . Li, C.-W. Ngo, and T. Mei, “Wave-vit: Unifying wavelet and transformers for visual representation learning,” inCom- puter Vision – ECCV 2022, S. Avidan, G. Brostow, M. Ciss ´e, G. M. Farinella, and T. Hassner, Eds. Cham: Springer Nature Switzerland, 2022, pp. 328–345

  56. [56]

    Uncertainty-aware source-free adaptive image super-resolution with wavelet augmentation transformer,

    Y . Ai, X. Zhou, H. Huang, L. Zhang, and R. He, “Uncertainty-aware source-free adaptive image super-resolution with wavelet augmentation transformer,” in2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 8142–8152

  57. [57]

    Frequency- aware feature fusion for dense image prediction,

    L. Chen, Y . Fu, L. Gu, C. Yan, T. Harada, and G. Huang, “Frequency- aware feature fusion for dense image prediction,”IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 46, no. 12, pp. 10 763– 10 780, 2024

  58. [58]

    Momfnet: A deep learning approach for insar phase filtering based on multi-objective multi-kernel feature extraction,

    X. Zhang, C. Peng, Z. Li, Y . Zhang, Y . Liu, and Y . Wang, “Momfnet: A deep learning approach for insar phase filtering based on multi-objective multi-kernel feature extraction,”Sensors, vol. 24, no. 23, 2024. [Online]. Available: https://www.mdpi.com/1424-8220/24/23/7821

  59. [59]

    Synthetic aperture radar image despeckling based on a deep learning network employing frequency domain decomposition,

    X. Zhao, F. Ren, H. Sun, and Q. Qi, “Synthetic aperture radar image despeckling based on a deep learning network employing frequency domain decomposition,”Electronics, vol. 13, no. 3, 2024. [Online]. Available: https://www.mdpi.com/2079-9292/13/3/490

  60. [60]

    IMAGGarment-1: Fine-grained garment generation for controllable fashion design,

    F. Shen, J. Yu, C. Wang, X. Jiang, X. Du, and J. Tang, “Imaggarment-1: Fine-grained garment generation for controllable fashion design,”arXiv preprint arXiv:2504.13176, 2025

  61. [61]

    Imagdressing-v1: Customizable virtual dressing,

    F. Shen, X. Jiang, X. He, H. Ye, C. Wang, X. Du, Z. Li, and J. Tang, “Imagdressing-v1: Customizable virtual dressing,” inProceedings of the AAAI Conference on Artificial Intelligence, vol. 39, no. 7, 2025, pp. 6795–6804

  62. [62]

    Long-term talkingface generation via motion-prior conditional diffusion model,

    F. Shen, C. Wang, J. Gao, Q. Guo, J. Dang, J. Tang, and T.-S. Chua, “Long-term talkingface generation via motion-prior conditional diffusion model,” inForty-second International Conference on Machine Learning, 2025. [Online]. Available: https://openreview.net/forum?id= aINERD9MzJ

  63. [63]

    U-net: Convolutional networks for biomedical image segmentation,

    O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” inInternational Conference on Medical image computing and computer-assisted intervention. Springer, 2015, pp. 234–241

  64. [64]

    Resunet++: An advanced archi- tecture for medical image segmentation,

    D. Jha, P. H. Smedsrud, M. A. Riegler, D. Johansen, T. D. Lange, P. Halvorsen, and H. D. Johansen, “Resunet++: An advanced archi- tecture for medical image segmentation,” in2019 IEEE International Symposium on Multimedia (ISM), 2019, pp. 225–2255

  65. [65]

    Rethinking Atrous Convolution for Semantic Image Segmentation

    L.-C. Chen, G. Papandreou, F. Schroff, and H. Adam, “Rethinking atrous convolution for semantic image segmentation,”arXiv preprint arXiv:1706.05587, 2017

  66. [66]

    Fully convolutional networks for semantic segmentation,

    J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” inProceedings of the IEEE conference on computer vision and pattern recognition, 2015, pp. 3431–3440

  67. [67]

    Segformer: Simple and efficient design for semantic segmentation with transformers,

    E. Xie, W. Wang, Z. Yu, A. Anandkumar, J. M. Alvarez, and P. Luo, “Segformer: Simple and efficient design for semantic segmentation with transformers,”Advances in neural information processing systems, vol. 34, pp. 12 077–12 090, 2021

  68. [68]

    Per-pixel classification is not all you need for semantic segmentation,

    B. Cheng, A. Schwing, and A. Kirillov, “Per-pixel classification is not all you need for semantic segmentation,”Advances in neural information processing systems, vol. 34, pp. 17 864–17 875, 2021. 14

  69. [69]

    A dual-stream coding fusion network for landslide mapping based on multisource remote sensing images,

    Y . He, H. Chen, Q. Zhu, Q. Zhang, W. Yang, J. Yang, W. Shi, L. Gao, and M. Filonchyk, “A dual-stream coding fusion network for landslide mapping based on multisource remote sensing images,”IEEE Transactions on Geoscience and Remote Sensing, vol. 63, pp. 1–21, 2025

  70. [70]

    Rsprompter: Learning to prompt for remote sensing instance seg- mentation based on visual foundation model,

    K. Chen, C. Liu, H. Chen, H. Zhang, W. Li, Z. Zou, and Z. Shi, “Rsprompter: Learning to prompt for remote sensing instance seg- mentation based on visual foundation model,”IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1–17, 2024

  71. [71]

    Integrated psinsar and sbas-insar analysis for landslide detection and monitoring,

    S. Hussain, B. Pan, W. Hussain, M. M. Sajjad, M. Ali, Z. Afzal, M. Abdullah-Al-Wadud, and A. Tariq, “Integrated psinsar and sbas-insar analysis for landslide detection and monitoring,”Physics and Chemistry of the Earth, Parts A/B/C, vol. 139, p. 103956, 2025. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1474706525001068