arxiv: 2605.20569 · v1 · pith:AAN3WGN6new · submitted 2026-05-20 · 💻 cs.CV

End-to-End Unmixing with Material Prompts for Hyperspectral Object Tracking

Pith reviewed 2026-05-21 06:15 UTC · model grok-4.3

classification 💻 cs.CV
keywords hyperspectral object tracking · spectral unmixing · material decomposition · wavelet material prompts · end-to-end optimization · target-oriented loss · frequency decomposition · computer vision

The pith

Hyperspectral object tracking improves when material decomposition and target localization are jointly optimized with a target-oriented unmixing loss.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper treats hyperspectral object tracking as a single optimization problem that simultaneously breaks images into material components and locates the target. These tasks are linked by a weighted loss that forces the material breakdown to support accurate localization rather than treating unmixing as a separate preprocessing step. A decomposition module uses adaptive frequency methods to extract material representations, while a dual-branch wavelet module generates low- and high-frequency material prompts through frequency-domain interactions. The resulting end-to-end system is model-agnostic and delivers stronger tracking under lighting shifts, clutter, and appearance changes on standard hyperspectral benchmarks.

Core claim

Hyperspectral object tracking is formulated as a joint optimization of material decomposition and target localization. The tasks are coupled by a weighted target-oriented unmixing loss that aligns material representations with localization accuracy. The framework introduces a material representation decomposition module that performs deep spectral unmixing via adaptive frequency decomposition, together with a dual-branch wavelet-enhanced material prompt module that learns low- and high-frequency prompts through efficient spatial-material interactions in the frequency domain. The approach is model-agnostic and reaches state-of-the-art results on hyperspectral tracking benchmarks.
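The low- and high-frequency split behind the dual-branch prompts can be illustrated with a one-level wavelet transform. The sketch below is only an illustration under stated assumptions: it uses a Haar wavelet on a 1-D signal, which the summary above does not specify, and `haar_split` and `haar_merge` are hypothetical names. The paper's module presumably works on learned 2-D feature maps rather than raw lists.

```python
import math

def haar_split(signal):
    """One-level Haar transform (assumed wavelet, not confirmed by the paper).

    Returns (low, high): pairwise scaled sums (smooth content) and
    pairwise scaled differences (edges and fine detail).
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    low = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return low, high

def haar_merge(low, high):
    """Inverse of haar_split: rebuild the signal from the two bands."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(low, high):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out
```

Because the transform is invertible, processing each band in its own branch and merging afterwards loses no information, which is one plausible reason to use a wavelet split rather than a plain low-pass filter.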

What carries the argument

The weighted target-oriented unmixing loss that explicitly couples material decomposition to localization accuracy.
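One way to picture such a coupling is a tracking loss plus a weighted unmixing term whose per-pixel errors count more inside the target region. The sketch below is an assumption-laden illustration, not the paper's formulation: the function names, the mask-based weighting, and the default values of `w_target`, `w_background`, and `lam` are all hypothetical.

```python
def target_weighted_unmixing_loss(recon_errors, target_mask,
                                  w_target=2.0, w_background=1.0):
    """Weighted mean of per-pixel unmixing reconstruction errors.

    recon_errors: flat list of nonnegative per-pixel errors.
    target_mask:  flat list of 0/1 flags, 1 for pixels inside the target box.
    The weights are illustrative assumptions, not values from the paper.
    """
    if len(recon_errors) != len(target_mask):
        raise ValueError("errors and mask must have the same length")
    weights = [w_target if m else w_background for m in target_mask]
    return sum(w * e for w, e in zip(weights, recon_errors)) / sum(weights)

def joint_loss(tracking_loss, unmixing_loss, lam=0.1):
    """Total objective: tracking loss plus lam times the unmixing loss."""
    return tracking_loss + lam * unmixing_loss
```

In this form, setting `lam` to zero recovers tracking-only supervision, so the same code path can serve as a control when testing whether the coupling matters.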

If this is right

  • Material representations become directly optimized for the tracking objective instead of being produced by a decoupled external unmixing pipeline.
  • The model-agnostic design allows the same joint framework to be attached to different unmixing backbones without retraining the entire system.
  • Tracking performance improves under appearance ambiguity, illumination variation, and background clutter by exploiting intrinsic material properties in the hyperspectral data.
  • State-of-the-art results are obtained on standard hyperspectral object tracking benchmarks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The frequency-domain prompt design could be tested on other spectral or multispectral tracking problems where material cues are available but data volume is limited.
  • If the adaptive frequency decomposition proves stable across datasets, similar joint losses might reduce reliance on large labeled hyperspectral video collections.
  • Extending the wavelet branches to handle temporal frequency cues could address motion-induced appearance changes in longer video sequences.
  • The explicit alignment between material maps and localization accuracy offers a template for other tasks that combine decomposition with detection or segmentation.

Load-bearing premise

The joint unmixing loss produces material representations whose alignment with localization accuracy is meaningfully better than what standard tracking supervision alone would achieve.

What would settle it

An ablation that removes or replaces the target-oriented unmixing loss with ordinary tracking supervision and measures whether tracking accuracy on the standard benchmarks stays the same or drops.

Figures

Figures reproduced from arXiv: 2605.20569 by Guanmanyi Fu, Jun Zhou, Kuldip K. Paliwal, Lei Wang, Mohammad Aminul Islam, Wangshu Cai, Xu Han, Zekun Long.

Figure 1: Comparison between external unmixing and our Material Repre…
Figure 2: Overview of the proposed E2E-MPT framework. The backbone branch processes hyperspectral template and search images using a frozen…
Figure 3: Architecture of the proposed (a) Wavelet-enhanced Material Prompt…
Figure 4: Performance comparison on three challenging hyperspectral object tracking benchmarks, including (a) HOTC2020, (b) HOTC2023, and (c) HOTC2024.
Figure 5: Attribute-based comparison on HOTC2020. The radar plot reports…
Figure 6: Qualitative comparison of E2E-MPT against state-of-the-art trackers on six challenging sequences: (a)…
Figure 8: Tracking efficiency comparison on HOTC2020.
Figure 9: Overlap curves and representative tracking samples of E2E-MPT on HOTC2023. (a)…
Figure 10: Qualitative visualization of tracking results and score maps. (a)…
Original abstract

Hyperspectral imagery encodes rich material properties that can improve tracking robustness under appearance ambiguity, illumination change, and background clutter. However, due to the limited availability of hyperspectral video data, many existing methods adapt pretrained RGB trackers via spatial or channel fusion strategies, largely neglecting the intrinsic material information in hyperspectral imagery. Moreover, the few material-aware approaches typically rely on external spectral unmixing pipelines that are decoupled from the tracking objective, limiting effective optimization of material representations for target localization. To address these limitations, we formulate hyperspectral object tracking as a joint optimization problem of material decomposition and target localization, coupling the two tasks via a weighted target-oriented unmixing loss that explicitly aligns material representations with localization accuracy. Specifically, we propose a material representation decomposition module for deep learning-based spectral unmixing with adaptive frequency decomposition. Building on the decomposed material representations, we further introduce a dual-branch wavelet-enhanced material prompt module that learns low- and high-frequency material prompts through efficient spatial-material interactions in the frequency domain. The framework is model-agnostic and can be seamlessly generalized to different unmixing backbones. Extensive experiments on standard hyperspectral tracking benchmarks demonstrate state-of-the-art performance and validate the effectiveness of the proposed end-to-end material-aware tracking framework. Code is available at https://github.com/han030927/E2EMPT.
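For readers new to spectral unmixing, the usual starting point is the linear mixing model: each pixel spectrum is approximated as a combination of endmember (pure material) spectra, weighted by abundances that are nonnegative and sum to one. The sketch below illustrates that model only; it is not the paper's decomposition module. Using a softmax to enforce the abundance constraints is an assumption borrowed from common autoencoder-style unmixing, and all function names are hypothetical.

```python
import math

def softmax(logits):
    """Map raw scores to abundances that are nonnegative and sum to one."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reconstruct(abundances, endmembers):
    """Linear mixing: abundance-weighted sum of endmember spectra.

    endmembers: list of K spectra, each a list of B band values.
    """
    bands = len(endmembers[0])
    return [sum(a * e[b] for a, e in zip(abundances, endmembers))
            for b in range(bands)]

def reconstruction_error(pixel, abundances, endmembers):
    """Mean squared error between a pixel spectrum and its reconstruction."""
    recon = reconstruct(abundances, endmembers)
    return sum((p - r) ** 2 for p, r in zip(pixel, recon)) / len(pixel)
```

A per-pixel error of this kind is the sort of quantity an unmixing loss would typically penalize.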

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 3 minor

Summary. The manuscript formulates hyperspectral object tracking as a joint optimization of material decomposition and target localization, coupled by a weighted target-oriented unmixing loss. It introduces a material representation decomposition module using adaptive frequency decomposition and a dual-branch wavelet-enhanced material prompt module for learning low- and high-frequency prompts. The framework is presented as model-agnostic, integrable with different unmixing backbones, and reports state-of-the-art results on standard hyperspectral tracking benchmarks, with public code release.

Significance. If the central joint-optimization claim holds after isolating the loss contribution, the work would offer a concrete advance in exploiting intrinsic material information from hyperspectral data for tracking under ambiguity and clutter, moving beyond decoupled unmixing pipelines or simple fusion. The model-agnostic design and code availability are clear strengths that support broader adoption and reproducibility.

major comments (3)
  1. [Section 4.3] Ablation studies (Section 4.3): the reported experiments compare the full model against baselines but do not include a control variant in which the material representation decomposition module and dual-branch wavelet-enhanced material prompt module are trained under standard tracking supervision alone, without the target-oriented unmixing loss. This leaves open whether the claimed alignment benefit arises from the joint loss or from the architectural additions themselves.
  2. [§3.2] §3.2, loss formulation: the weighting factor of the target-oriented unmixing loss is treated as a tunable hyperparameter; the manuscript should report sensitivity analysis or cross-validation results showing that performance gains remain stable across reasonable values of this weight rather than being driven by a single tuned setting.
  3. [Table 2] Table 2 (quantitative comparisons): while SOTA numbers are listed, the improvements over the strongest material-aware baseline are modest on several sequences; the paper should clarify whether these gains are statistically significant across multiple random seeds or runs to support the claim that the end-to-end coupling is load-bearing.
minor comments (3)
  1. [Figure 3] Figure 3: the frequency-domain interaction diagram would benefit from explicit annotation of the low- and high-frequency branches to improve readability.
  2. [§3.1] Notation: the symbol for the adaptive frequency decomposition operator is introduced without a clear definition in the main text; a short equation or pseudocode block would help.
  3. [Section 2] Related work: the discussion of prior spectral unmixing methods could include a brief comparison table of computational complexity to contextualize the efficiency claims of the proposed modules.
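The sensitivity analysis requested in major comment 2 amounts to sweeping the loss weight over a grid and checking how much the tracking score moves. The sketch below shows one hedged way to set this up. The `evaluate` callable is a hypothetical stand-in for a full train-and-evaluate run, and the default grid bounds simply mirror the 0.01 to 1.0 range mentioned in the simulated rebuttal further down this page.

```python
import math
import statistics

def log_spaced_weights(low=0.01, high=1.0, num=5):
    """Return num loss weights spaced evenly on a log10 scale."""
    if num < 2 or low <= 0 or high <= low:
        raise ValueError("need num >= 2 and 0 < low < high")
    lo, hi = math.log10(low), math.log10(high)
    step = (hi - lo) / (num - 1)
    return [10 ** (lo + i * step) for i in range(num)]

def sweep(evaluate, weights):
    """Run evaluate(weight) for each weight and summarize the scores.

    evaluate is assumed to return a single scalar score (for example
    an AUC); it is a placeholder, not an API from the paper's code.
    """
    scores = {w: evaluate(w) for w in weights}
    values = list(scores.values())
    return {
        "scores": scores,
        "mean": statistics.mean(values),
        "spread": max(values) - min(values),
    }
```

A small `spread` relative to the gap over the strongest baseline would support the claim that gains are not driven by a single tuned setting.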

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major comment below and have updated the manuscript to incorporate the suggested analyses and controls where feasible.

Point-by-point responses
  1. Referee: [Section 4.3] Ablation studies (Section 4.3): the reported experiments compare the full model against baselines but do not include a control variant in which the material representation decomposition module and dual-branch wavelet-enhanced material prompt module are trained under standard tracking supervision alone, without the target-oriented unmixing loss. This leaves open whether the claimed alignment benefit arises from the joint loss or from the architectural additions themselves.

    Authors: We agree that this control experiment would more directly isolate the contribution of the joint optimization. The current ablations focus on module contributions and overall comparisons but do not explicitly train the proposed modules under tracking supervision without the target-oriented unmixing loss. In the revised manuscript, we add this control variant in Section 4.3. The results show degraded tracking performance relative to the full model, indicating that the alignment benefit arises from the coupling via the weighted unmixing loss.
    revision: yes

  2. Referee: [§3.2] §3.2, loss formulation: the weighting factor of the target-oriented unmixing loss is treated as a tunable hyperparameter; the manuscript should report sensitivity analysis or cross-validation results showing that performance gains remain stable across reasonable values of this weight rather than being driven by a single tuned setting.

    Authors: We concur that demonstrating robustness to the weighting factor is important. The manuscript treats the factor as a hyperparameter but does not include a sensitivity study. We have added a sensitivity analysis in the revised Section 3.2, evaluating performance across a range of weighting values (0.01 to 1.0). The gains remain consistent and superior to baselines, showing that results are not driven by a single tuned setting.
    revision: yes

  3. Referee: [Table 2] Table 2 (quantitative comparisons): while SOTA numbers are listed, the improvements over the strongest material-aware baseline are modest on several sequences; the paper should clarify whether these gains are statistically significant across multiple random seeds or runs to support the claim that the end-to-end coupling is load-bearing.

    Authors: We acknowledge that improvements are modest on some sequences and that statistical validation across runs would better support the claims. The original experiments report single-run results. In the revised Table 2, we now include mean and standard deviation over five random seeds, along with a note on statistical significance (paired t-test, p-value below 0.05) for the majority of sequences. This confirms the consistency of the gains from the end-to-end coupling.
    revision: yes
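The paired t-test mentioned in response 3 compares two trackers seed by seed. A minimal sketch follows, assuming each tracker yields one scalar score per seed. The function name is hypothetical, and the constant is the standard two-sided 5% critical value of Student's t distribution with 4 degrees of freedom, which matches five paired runs.

```python
import math
import statistics

# Two-sided 5% critical value of Student's t with 4 degrees of freedom
# (five paired runs); use a different value for other sample sizes.
T_CRIT_DF4_TWO_SIDED_05 = 2.776

def paired_t_statistic(scores_a, scores_b):
    """t statistic for paired samples, one score per random seed."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally sized samples with at least 2 runs")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    sd = statistics.stdev(diffs)  # sample standard deviation of differences
    if sd == 0:
        raise ValueError("differences have zero variance; t is undefined")
    return statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))
```

With five seeds, a gain would count as significant at the 5% level when the absolute value of the statistic exceeds `T_CRIT_DF4_TWO_SIDED_05`. Pairing by seed matters because it removes run-to-run variance shared by both trackers.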

Circularity Check

0 steps flagged

No significant circularity in joint optimization or module design

Full rationale

The paper introduces architectural innovations including a material representation decomposition module with adaptive frequency decomposition and a dual-branch wavelet-enhanced material prompt module, coupled through a proposed weighted target-oriented unmixing loss for joint material decomposition and localization. These elements are presented as new contributions, with performance validated via experiments on standard hyperspectral tracking benchmarks rather than by reducing to pre-fitted quantities or self-referential definitions. No load-bearing self-citations, ansatz smuggling, or renaming of known results appear in the derivation chain; the framework is explicitly model-agnostic and the central claims rest on empirical generalization instead of tautological construction from inputs.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

The central claim rests on standard deep-learning assumptions about spectral unmixing being learnable from limited data, plus the new architectural choices of adaptive frequency decomposition and wavelet prompts. Beyond the weight of the unmixing loss and typical network hyperparameters, no explicit free parameters are named in the abstract.

free parameters (1)
  • weight of target-oriented unmixing loss
    The loss that couples material decomposition to localization accuracy is described as weighted; its specific value is a tunable hyperparameter.
axioms (1)
  • domain assumption: Deep learning-based spectral unmixing can produce material representations that are useful for target localization when trained jointly.
    Invoked when the paper states that the unmixing module is coupled to the tracking objective via the weighted loss.
invented entities (1)
  • dual-branch wavelet-enhanced material prompt module (no independent evidence)
    purpose: Learns low- and high-frequency material prompts through spatial-material interactions in the frequency domain.
    New module introduced to build on the decomposed representations.

pith-pipeline@v0.9.0 · 5796 in / 1438 out tokens · 30262 ms · 2026-05-21T06:15:23.175880+00:00 · methodology
