arxiv: 2602.22822 · v2 · submitted 2026-02-26 · 💻 cs.AI · cs.LG

FlexMS is a flexible framework for benchmarking deep learning-based mass spectrum prediction tools in metabolomics

Pith reviewed 2026-05-15 19:18 UTC · model grok-4.3

classification 💻 cs.AI cs.LG
keywords benchmark framework · mass spectrum prediction · deep learning · metabolomics · model evaluation · transfer learning · retrieval benchmarks

The pith

FlexMS creates a benchmark framework for constructing and evaluating diverse deep learning architectures to predict mass spectra.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents FlexMS as a flexible benchmark for mass spectrum prediction in metabolomics. It lets users build many different model combinations dynamically and test them on public datasets with several performance metrics. The authors examine how dataset diversity, hyperparameters such as learning rate, data sparsity, pretraining, metadata choices, and transfer learning shape results. These analyses give concrete advice on picking models, and the paper adds retrieval tests that mimic real molecule identification tasks.

Core claim

FlexMS is a benchmark framework for mass spectrum prediction that supports the dynamic construction of many distinct combinations of model architectures and assesses their performance on preprocessed public datasets using several metrics. It provides insights into factors that influence performance: the structural diversity of datasets, hyperparameters, pretraining, metadata ablation settings, and cross-domain transfer learning.

What carries the argument

FlexMS framework supporting dynamic construction of model architecture combinations and performance evaluation on datasets via multiple metrics.
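As a rough illustration of what "dynamic construction of model architecture combinations" can look like in code, the sketch below pairs every registered encoder with every registered prediction head. All names here (`ENCODERS`, `HEADS`, `build_model`, `all_combinations`) are hypothetical and are not taken from FlexMS, and the components are trivial placeholders rather than neural networks.

```python
from itertools import product

# Hypothetical registries; names and components are illustrative only,
# not the FlexMS API. Real entries would be neural network modules.
ENCODERS = {
    "atom_count": lambda smiles: [float(sum(ch.isalpha() for ch in smiles))],
    "length": lambda smiles: [float(len(smiles))],
}
HEADS = {
    "uniform": lambda feats, n_bins: [1.0 / n_bins] * n_bins,
    "first_bin": lambda feats, n_bins: [1.0] + [0.0] * (n_bins - 1),
}


def build_model(encoder_name, head_name, n_bins=4):
    """Compose one encoder and one head into a single callable."""
    encoder = ENCODERS[encoder_name]
    head = HEADS[head_name]

    def model(smiles):
        # Encode the molecule string, then map features to a binned spectrum.
        return head(encoder(smiles), n_bins)

    return model


def all_combinations(n_bins=4):
    """Enumerate every encoder/head pairing as a named model."""
    return {
        f"{enc}+{head}": build_model(enc, head, n_bins)
        for enc, head in product(ENCODERS, HEADS)
    }


models = all_combinations()
```

With two encoders and two heads this yields four named models; adding one entry to either registry extends the set of combinations without touching the composition code, which is the property a benchmark of this kind relies on.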

If this is right

  • Insights emerge on how dataset structural diversity, learning rate, data sparsity, pretraining, metadata ablation, and transfer learning affect model performance.
  • The framework supplies practical guidance for selecting suitable models for mass spectrum tasks.
  • Retrieval benchmarks simulate real identification scenarios by scoring matches against predicted spectra.
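The retrieval setting in the last point can be sketched as follows: a measured query spectrum is compared against the predicted spectrum of each candidate molecule, and candidates are ranked by similarity. This is a minimal sketch that assumes spectra are already binned into equal-length intensity vectors and that cosine similarity is the scoring function; the function names and the toy data are hypothetical, and the paper's actual scoring pipeline may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length binned spectra."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty spectrum matches nothing
    return dot / (norm_a * norm_b)


def rank_candidates(query, predicted):
    """Return candidate names sorted from best to worst match."""
    scores = {name: cosine_similarity(query, spec) for name, spec in predicted.items()}
    return sorted(scores, key=scores.get, reverse=True)


def top_k_hit(query, predicted, true_name, k):
    """True if the correct molecule is among the k best-scoring candidates."""
    return true_name in rank_candidates(query, predicted)[:k]


# Toy data: one measured spectrum and predicted spectra for three candidates.
query = [0.0, 1.0, 0.5, 0.0]
predicted = {
    "A": [0.0, 0.9, 0.6, 0.0],
    "B": [1.0, 0.0, 0.0, 0.2],
    "C": [0.0, 0.5, 0.0, 0.5],
}
ranking = rank_candidates(query, predicted)
```

Averaging `top_k_hit` over many queries gives a top-k retrieval accuracy, one common way to summarize this kind of benchmark.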

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The modular design could support rapid testing of hybrid architectures that combine elements from multiple existing approaches.
  • Standardized benchmarks like this might reduce redundant experiments when new prediction models appear in the literature.
  • Extending the framework to incorporate private or proprietary spectra collections could reveal whether public-data insights generalize to laboratory settings.

Load-bearing premise

That the preprocessed public datasets and chosen evaluation metrics sufficiently represent real-world metabolomics identification challenges, and that dynamic model combinations will yield practically useful performance insights.

What would settle it

Applying FlexMS to a fresh collection of experimental spectra from an unseen metabolomics source and observing no statistically significant performance gaps across tested model combinations would undermine the claimed value of the flexible benchmarking approach.

Original abstract

The identification and property prediction of chemical molecules is of central importance in the advancement of drug discovery and material science, where the tandem mass spectrometry technology gives valuable fragmentation cues in the form of mass-to-charge ratio peaks. However, the lack of experimental spectra hinders the attachment of each molecular identification, and thus urges the establishment of prediction approaches for computational models. Deep learning models appear promising for predicting molecular structure spectra, but overall assessment remains challenging as a result of the heterogeneity in methods and the lack of well-defined benchmarks. To address this, our contribution is the creation of benchmark framework FlexMS for constructing and evaluating diverse model architectures in mass spectrum prediction. With its easy-to-use flexibility, FlexMS supports the dynamic construction of numerous distinct combinations of model architectures, while assessing their performance on preprocessed public datasets using different metrics. In this paper, we provide insights into factors influencing performance, including the structural diversity of datasets, hyperparameters like learning rate and data sparsity, pretraining effects, metadata ablation settings and cross-domain transfer learning analysis. This provides practical guidance in choosing suitable models. Moreover, retrieval benchmarks simulate practical identification scenarios and score potential matches based on predicted spectra.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces FlexMS, a flexible benchmark framework for constructing and evaluating diverse deep learning model architectures for mass spectrum prediction in metabolomics. It supports dynamic combinations of architectures, evaluates them on preprocessed public datasets using multiple metrics, and reports insights on factors including learning rate, data sparsity, pretraining, metadata ablation, and cross-domain transfer learning, along with retrieval benchmarks for identification scenarios.

Significance. If the framework is robustly implemented with reproducible code and the reported insights are backed by concrete, validated results, FlexMS could help standardize benchmarking in a heterogeneous field and provide practical guidance for model selection. The emphasis on flexibility in architecture combinations and retrieval benchmarks is a positive contribution toward addressing the lack of well-defined evaluation standards.

major comments (2)
  1. [Abstract] The claim that the framework 'provides insights into factors influencing performance' and 'practical guidance in choosing suitable models' cannot be assessed because the abstract (and available text) supplies no concrete quantitative results, error bars, specific findings, or validation details on any of the listed factors.
  2. [Framework evaluation sections] The practical guidance on hyperparameters, pretraining, and transfer learning is load-bearing only if the preprocessed public datasets preserve key real-world difficulties (instrument-specific noise, adduct distributions, lab-varying sparsity). If preprocessing steps systematically reduce these challenges, performance differences across model combinations may be benchmark artifacts rather than generalizable signals.
minor comments (2)
  1. [Abstract] The title and opening sentence could more clearly distinguish the contribution as a benchmarking framework rather than a new prediction method.
  2. [Throughout] Notation and terminology: ensure consistent use of terms such as 'metadata ablation' and 'cross-domain transfer' with explicit definitions or references on first use.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript introducing FlexMS. We address each major comment point by point below and indicate the revisions we will make.

  1. Referee: [Abstract] The claim that the framework 'provides insights into factors influencing performance' and 'practical guidance in choosing suitable models' cannot be assessed because the abstract (and available text) supplies no concrete quantitative results, error bars, specific findings, or validation details on any of the listed factors.

    Authors: We agree that the abstract should include concrete quantitative results to make the claims directly assessable. Although the full manuscript reports specific findings on factors such as learning rate effects, data sparsity, pretraining, metadata ablation, and retrieval accuracies in the evaluation sections, the abstract is currently high-level. In the revised manuscript we will update the abstract to summarize key quantitative results, including performance metrics with error bars from multiple runs and specific insights on the listed factors. revision: yes

  2. Referee: [Framework evaluation sections] The practical guidance on hyperparameters, pretraining, and transfer learning is load-bearing only if the preprocessed public datasets preserve key real-world difficulties (instrument-specific noise, adduct distributions, lab-varying sparsity). If preprocessing steps systematically reduce these challenges, performance differences across model combinations may be benchmark artifacts rather than generalizable signals.

    Authors: We acknowledge the importance of ensuring the preprocessed datasets retain real-world characteristics. The preprocessing follows standard metabolomics practices on public datasets (e.g., GNPS) and is intended to retain instrument noise patterns, adduct distributions, and sparsity variations. In the revision we will expand the methods and evaluation sections with explicit statistics comparing pre- and post-preprocessing distributions for adducts and sparsity, plus a brief discussion of limitations. We will not add entirely new raw-data experiments in this revision as they fall outside the current scope, but the added details will clarify the benchmark's fidelity. revision: partial
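    One way the promised comparison of pre- and post-preprocessing distributions could be carried out is sketched below, using the Jensen-Shannon divergence between adduct frequency distributions. This is an assumption about the method, not something the rebuttal specifies; the function names and the toy adduct lists are hypothetical.

```python
import math
from collections import Counter


def distribution(labels, categories):
    """Relative frequency of each category in a list of labels."""
    counts = Counter(labels)
    total = sum(counts[c] for c in categories)
    if total == 0:
        raise ValueError("no labels fall into the given categories")
    return [counts[c] / total for c in categories]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint."""
    mixture = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)


# Toy adduct labels before and after a hypothetical preprocessing step
# that happens to drop all negative-mode spectra.
categories = ["[M+H]+", "[M+Na]+", "[M-H]-"]
raw = ["[M+H]+"] * 6 + ["[M+Na]+"] * 3 + ["[M-H]-"] * 1
processed = ["[M+H]+"] * 8 + ["[M+Na]+"] * 2

shift = js_divergence(distribution(raw, categories),
                      distribution(processed, categories))
```

    A value near 0 would indicate that preprocessing left the adduct distribution essentially unchanged, while larger values would flag the kind of systematic shift the referee is concerned about.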

Circularity Check

0 steps flagged

No circularity: benchmark framework paper introduces tool without self-referential derivations

The manuscript describes the creation of the FlexMS software framework for constructing and evaluating mass spectrum prediction models on preprocessed public datasets. Its central contribution is the tool's flexibility in supporting dynamic model combinations and reporting observational insights on hyperparameters, pretraining, and transfer learning. No equations, fitted parameters renamed as predictions, or load-bearing self-citations appear in the derivation chain. The work is self-contained as an engineering contribution; performance claims rest on external public datasets and standard metrics rather than reducing to the framework's own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The contribution is a software framework rather than a theoretical derivation, so it rests on standard machine-learning assumptions about supervised training and data preprocessing.

axioms (1)
  • [standard math] Standard supervised deep learning training procedures and evaluation metrics apply to mass spectrum prediction.
    The framework description assumes typical ML pipelines without stating novel axioms.
