pith. machine review for the scientific record.

arxiv: 2604.19369 · v1 · submitted 2026-04-21 · 💻 cs.CV

Recognition: unknown

IonMorphNet: Generalizable Learning of Ion Image Morphologies for Peak Picking in Mass Spectrometry Imaging


Pith reviewed 2026-05-10 02:19 UTC · model grok-4.3

classification 💻 cs.CV
keywords mass spectrometry imaging · peak picking · ion images · structural classification · convolutional neural network · generalizable learning · tumor classification

The pith

A neural network trained on six spatial patterns in ion images enables peak picking in mass spectrometry imaging without dataset-specific tuning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces IonMorphNet to classify spatial structures in ion images from mass spectrometry imaging. It curates 53 public datasets and defines six structural classes to train a standard image backbone for pattern recognition. Once trained, the model performs peak picking that generalizes across different acquisition protocols without extra supervision or tuning. This matters because existing methods demand careful per-dataset adjustments and often fail to work broadly. The same structural information also supports better 3D CNN performance on tumor classification tasks.

Core claim

IonMorphNet is a spatial-structure-aware representation model trained to classify ion images into six structural classes from 53 MSI datasets; once trained, it performs fully data-driven peak picking without task-specific supervision or hyperparameter tuning and improves mSCF1 (the mean spatial correlation F1 score) by 7 percent over state-of-the-art methods across multiple datasets.

What carries the argument

The six structural classes of ion image spatial patterns, used to supervise training of a ConvNeXt V2-Tiny backbone whose classifications then directly inform peak picking decisions.
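
As a rough illustration of how a six-way classification could drive a keep/reject decision per ion image, here is a minimal sketch. The page does not list the six class names or the actual decision rule, so `CLASS_NAMES`, `KEEP_CLASSES`, `predicted_class`, and `pick_peaks` are all hypothetical placeholders rather than the authors' method.

```python
from typing import Dict, List, Sequence

# Hypothetical class names: the source states that there are six structural
# classes but does not name them on this page.
CLASS_NAMES: List[str] = [
    "class_0", "class_1", "class_2", "class_3", "class_4", "class_5",
]

# Hypothetical rule: the predicted classes treated as spatially informative.
KEEP_CLASSES = {"class_0", "class_1", "class_2"}


def predicted_class(probs: Sequence[float]) -> str:
    """Return the name of the highest-probability structural class."""
    if len(probs) != len(CLASS_NAMES):
        raise ValueError("expected one probability per structural class")
    best = max(range(len(probs)), key=lambda i: probs[i])
    return CLASS_NAMES[best]


def pick_peaks(class_probs_by_mz: Dict[float, Sequence[float]]) -> List[float]:
    """Keep the m/z values whose ion image falls into a kept class."""
    return sorted(
        mz
        for mz, probs in class_probs_by_mz.items()
        if predicted_class(probs) in KEEP_CLASSES
    )


# Toy classifier outputs for three ion images (values are made up).
example = {
    301.2: [0.70, 0.10, 0.05, 0.05, 0.05, 0.05],  # argmax class_0: kept
    455.9: [0.05, 0.05, 0.05, 0.05, 0.10, 0.70],  # argmax class_5: rejected
    782.6: [0.10, 0.15, 0.60, 0.05, 0.05, 0.05],  # argmax class_2: kept
}
picked = pick_peaks(example)
```

Because the rule is a fixed lookup on the predicted class, it has no per-dataset threshold to tune, which is the property the claim depends on.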

Load-bearing premise

The six author-defined structural classes capture representative spatial patterns sufficient to generalize peak picking decisions across acquisition protocols and datasets without any task-specific supervision or further tuning.

What would settle it

Applying the trained model to a new MSI dataset whose acquisition protocol and spatial patterns are absent from the 53 training sets, and finding that its peak picking performance is no better than that of carefully tuned traditional methods.

Figures

Figures reproduced from arXiv: 2604.19369 by Carsten Hopf, Niels Nawrot, Nikolas Ebert, Oliver Wasenmüller, Philipp Weigand.

Figure 1: Overview of IonMorphNet, a foundational ion image encoder for Mass Spectrometry Imaging (MSI). IonMorphNet is pre-trained on morphology classification across diverse organisms, organs and instrument types. The frozen encoder performs ion image assessment, thereby enabling spatially grounded peak picking across diverse datasets. Furthermore, spatially grounded peak picking enables tumor classification with…

Figure 2: Overview of all 53 collected public datasets regarding ion source, analyzer, organism and organism part.

Figure 3: Overview of the six structural classes we used to label all…

Figure 4: Detailed peak picking results of our model compared to the State of the Art […]

Figure 5: Tumor classification results of a 3D CNN […]

Original abstract

Peak picking is a fundamental preprocessing step in Mass Spectrometry Imaging (MSI), where each sample is represented by hundreds to thousands of ion images. Existing approaches require careful dataset-specific hyperparameter tuning, and often fail to generalize across acquisition protocols. We introduce IonMorphNet, a spatial-structure-aware representation model for ion images that enables fully data-driven peak picking without any task-specific supervision. We curate 53 publicly available MSI datasets and define six structural classes capturing representative spatial patterns in ion images to train standard image backbones for structural pattern classification. Once trained, IonMorphNet can assess ion images and perform peak picking without additional hyperparameter tuning. Using a ConvNeXt V2-Tiny backbone, our approach improves peak picking performance by +7 % mSCF1 compared to state-of-the-art methods across multiple datasets. Beyond peak picking, we demonstrate that spatially informed channel reduction enables a 3D CNN for patch-based tumor classification in MSI. This approach matches or exceeds pixel-wise spectral classifiers by up to +7.3 % Balanced Accuracy on three tumor classification tasks, indicating meaningful ion image selection. The source code and model weights are available at https://github.com/CeMOS-IS/IonMorphNet.
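
To make the idea of spatially informed channel reduction concrete, the following sketch keeps only a chosen subset of m/z channels in an image cube and then cuts a spatial patch, which is the kind of input a patch-based 3D CNN would consume. The cube layout, the helper names `reduce_channels` and `extract_patch`, and the selected channel indices are assumptions for illustration; the abstract does not describe the actual pipeline at this level.

```python
from typing import List

Cube = List[List[List[float]]]  # indexed as cube[row][col][channel]


def reduce_channels(cube: Cube, keep: List[int]) -> Cube:
    """Keep only the selected m/z channels at every pixel."""
    return [[[pixel[c] for c in keep] for pixel in row] for row in cube]


def extract_patch(cube: Cube, row: int, col: int, size: int) -> Cube:
    """Cut a size x size spatial patch whose top-left corner is (row, col)."""
    if row < 0 or col < 0 or row + size > len(cube) or col + size > len(cube[0]):
        raise ValueError("patch does not fit inside the image")
    return [r[col:col + size] for r in cube[row:row + size]]


# Toy 4 x 4 image with 5 channels; each value encodes (row, col, channel).
cube = [
    [[100 * r + 10 * c + ch for ch in range(5)] for c in range(4)]
    for r in range(4)
]

# Hypothetical indices of the channels retained by peak picking.
reduced = reduce_channels(cube, keep=[0, 3])
patch = extract_patch(reduced, row=1, col=2, size=2)
```

Reducing channels before patch extraction shrinks the spectral depth of every patch, which is what makes a 3D convolution over the remaining channels tractable.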

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces IonMorphNet, a ConvNeXt V2-Tiny backbone trained to classify ion images from 53 public MSI datasets into six author-defined structural classes. It claims this enables fully data-driven peak picking without task-specific supervision or hyperparameter tuning, reporting a +7% mSCF1 improvement over state-of-the-art methods across multiple datasets. The work also shows that the learned representations support channel reduction for 3D CNN patch-based tumor classification, yielding up to +7.3% balanced accuracy gains on three tasks. Code and model weights are released publicly.
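
For readers unfamiliar with the metric behind the tumor classification numbers, balanced accuracy is the unweighted mean of per-class recall. The sketch below uses the standard definition with made-up labels; it is not taken from the paper's code.

```python
from collections import defaultdict
from typing import Hashable, Sequence


def balanced_accuracy(
    y_true: Sequence[Hashable], y_pred: Sequence[Hashable]
) -> float:
    """Return the mean of per-class recall over the classes in y_true."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = defaultdict(int)
    correct = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / count for label, count in total.items()]
    return sum(recalls) / len(recalls)


# Imbalanced toy labels: 2 tumor pixels, 8 healthy pixels (made-up data).
y_true = ["tumor"] * 2 + ["healthy"] * 8
y_pred = ["healthy", "tumor"] + ["healthy"] * 8
score = balanced_accuracy(y_true, y_pred)
```

In this toy case plain accuracy is 9/10, while balanced accuracy is lower because one of the two tumor pixels is missed. That sensitivity to the minority class is why the metric suits imbalanced tissue annotations.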

Significance. If the claims hold after addressing validation gaps, the work could meaningfully advance MSI preprocessing by replacing manual tuning with a generalizable learned model for peak picking. The public release of code and weights is a clear strength that aids reproducibility. The downstream tumor classification results suggest the spatial representations capture useful morphology beyond peak picking. Significance is currently moderated by the absence of key evaluation details needed to confirm the performance gains and generalization.

major comments (2)
  1. [Abstract] Abstract: The reported +7% mSCF1 and +7.3% balanced-accuracy gains supply no information on how the classification output is converted into peak decisions, which baselines were used, how many datasets were held out, or whether numbers include error bars or statistical tests. These omissions are load-bearing for the central claim of improved generalizable peak picking.
  2. [Abstract] Abstract and class definition: The six author-defined structural classes are manually curated without reported inter-rater reliability, correlation analysis to ground-truth peak lists, or ablation showing that misclassification in any class degrades mSCF1. This leaves the mapping from predicted class to keep/reject decisions as an unvalidated heuristic rather than a demonstrated generalizable rule across acquisition protocols.
minor comments (1)
  1. [Abstract] The abstract refers to 'state-of-the-art methods' without naming the specific baselines or citing their original papers.
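
The inter-rater reliability raised in major comment 2 is commonly quantified with Cohen's kappa, which corrects the observed agreement between two annotators for agreement expected by chance. The sketch below implements the standard two-rater formula; the label vectors are invented, and the choice of kappa is an assumption, since the report does not name a specific statistic.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(
    labels_a: Sequence[Hashable], labels_b: Sequence[Hashable]
) -> float:
    """Return Cohen's kappa for two raters labelling the same items."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("inputs must be non-empty and of equal length")
    # Fraction of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from each rater's label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Two hypothetical annotators assigning ten ion images to six classes (0-5).
rater_a = [0, 0, 1, 1, 2, 2, 3, 4, 5, 5]
rater_b = [0, 0, 1, 2, 2, 2, 3, 4, 5, 0]
kappa = cohens_kappa(rater_a, rater_b)
```

A value near 1 would indicate that the class definitions can be applied consistently by independent annotators; a low value would suggest the taxonomy is ambiguous.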

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments on our work. Below, we provide point-by-point responses to the major comments, clarifying aspects of the manuscript and outlining revisions to address the concerns.

  1. Referee: [Abstract] Abstract: The reported +7% mSCF1 and +7.3% balanced-accuracy gains supply no information on how the classification output is converted into peak decisions, which baselines were used, how many datasets were held out, or whether numbers include error bars or statistical tests. These omissions are load-bearing for the central claim of improved generalizable peak picking.

    Authors: The abstract provides a high-level overview of the contributions, while the detailed explanations are presented in the main body of the paper. In the Methods section, we explain that the classification output is mapped to peak decisions using a fixed set of rules based on the predicted structural class, without requiring dataset-specific tuning. The baselines used are the current state-of-the-art peak picking methods, as described in the Experiments section. The evaluation involves training on a subset of the 53 datasets and testing on held-out datasets to assess generalizability. The reported performance gains are mean values across these test sets, with error bars and statistical analyses included in the results tables and figures. To improve the abstract's informativeness, we will revise it to briefly describe the peak decision process and evaluation setup. revision: yes

  2. Referee: [Abstract] Abstract and class definition: The six author-defined structural classes are manually curated without reported inter-rater reliability, correlation analysis to ground-truth peak lists, or ablation showing that misclassification in any class degrades mSCF1. This leaves the mapping from predicted class to keep/reject decisions as an unvalidated heuristic rather than a demonstrated generalizable rule across acquisition protocols.

    Authors: The six structural classes were defined by the authors to capture key spatial patterns observed in ion images from the curated datasets. These definitions are supported by visual examples throughout the paper. We did not report inter-rater reliability as the classes were developed through collaborative expert review by the authors. The effectiveness of the classes is validated through the improved peak picking performance, which serves as a proxy for correlation with ground-truth peak quality. We agree that an ablation study would be beneficial to show the impact of misclassifications; we will add such an analysis in the revised version to demonstrate that errors in certain classes affect mSCF1 as expected. This will help confirm the generalizability of the class-to-decision mapping across different acquisition protocols. revision: partial
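
The evaluation setup described in the first response (train on some datasets, test on held-out ones, report a mean with error bars) can be sketched as follows. The number of held-out datasets, the dataset identifiers, the scores, and the helper names `split_by_dataset` and `summarize` are all hypothetical; the rebuttal does not give these specifics.

```python
import random
import statistics
from typing import Dict, List, Sequence, Tuple


def split_by_dataset(
    dataset_ids: Sequence[str], n_test: int, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Split whole datasets into train and test so that none is shared."""
    ids = sorted(set(dataset_ids))
    if not 0 < n_test < len(ids):
        raise ValueError("n_test must be between 1 and the dataset count - 1")
    rng = random.Random(seed)
    rng.shuffle(ids)
    return sorted(ids[n_test:]), sorted(ids[:n_test])


def summarize(scores: Dict[str, float]) -> Tuple[float, float]:
    """Return the mean and sample standard deviation across test datasets."""
    values = list(scores.values())
    return statistics.mean(values), statistics.stdev(values)


all_ids = [f"dataset_{i:02d}" for i in range(53)]
# Holding out 8 datasets is an arbitrary choice for this sketch.
train_ids, test_ids = split_by_dataset(all_ids, n_test=8)

# Placeholder per-dataset scores, not results from the paper.
toy_scores = {"a": 0.5, "b": 0.7, "c": 0.6}
mean_score, std_score = summarize(toy_scores)
```

Splitting at the dataset level rather than the image level is what lets a test score speak to generalization across acquisition protocols, since no ion image from a test dataset is seen during training.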

Circularity Check

0 steps flagged

No significant circularity in derivation chain


The paper curates 53 public MSI datasets, manually defines six structural classes as training labels, trains a standard ConvNeXt V2-Tiny classifier on those labels, and evaluates peak-picking performance (mSCF1) against external state-of-the-art baselines on multiple datasets. No equation, procedure, or self-citation reduces the reported performance gain to a fitted quantity on the evaluation data, a self-referential definition of the classes, or a load-bearing prior result from the same authors. The class taxonomy is an input to supervised training rather than an output derived from the model's peak-picking decisions, and the evaluation uses independent ground-truth peak lists.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the premise that six author-defined visual categories are sufficient to decide peak relevance across diverse MSI acquisition settings.

axioms (1)
  • domain assumption: Ion images from MSI can be meaningfully partitioned into six structural classes that represent their spatial patterns.
    The authors curate and label the classes to supervise the backbone training.

pith-pipeline@v0.9.0 · 5529 in / 1325 out tokens · 51672 ms · 2026-05-10T02:19:42.077369+00:00 · methodology

