
arxiv: 2606.06224 · v2 · pith:NRAXXCGYnew · submitted 2026-06-04 · 💻 cs.CV · cs.LG

Symb-xMIL: Symbolic Explanations for Multiple Instance Learning in Digital Pathology

Pith reviewed 2026-06-28 02:27 UTC · model grok-4.3

classification 💻 cs.CV cs.LG
keywords multiple instance learning · explainable AI · digital pathology · symbolic explanations · logical rules · histopathology · model interpretability · post-hoc explanation

The pith

Symb-xMIL measures how well MIL model predictions in pathology align with logical rules combining tissue features.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents Symb-xMIL as a post-hoc method that goes beyond heatmaps by scoring how closely a multiple instance learning model's output matches human-readable logical combinations of features, such as AND, OR, and NOT relationships. This alignment is intended to expose the semantic patterns and decision logic that heatmaps alone cannot show, particularly when predictions depend on interactions across tissue regions. The authors test the approach on synthetic data where ground-truth rules are known, on a tumor detection task where it identifies varied patterns and model errors, and on an HPV prediction task where aligned rules improve survival grouping beyond standard labels.

Core claim

Symb-xMIL quantifies the alignment between a MIL model's predictions and candidate logical rules over input features, recovering known rules on synthetic data, revealing heterogeneous decision patterns and errors on tumor detection, and refining survival stratification on HPV-related head and neck cancer data.

What carries the argument

Alignment scores that compare MIL model behavior against logical (AND/OR/NOT) combinations of tissue features.
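A minimal sketch of what such an alignment score could look like. It assumes, without confirmation from this page, that alignment is the Pearson correlation between the model's responses on concept subsets and the truth values of a candidate rule on the same subsets. The toy model, the query list, and all function names are hypothetical; only the concept names are borrowed from the Camelyon16 cluster table.

```python
import math
from itertools import combinations

# Concept names borrowed from the Camelyon16 cluster table; everything else is hypothetical.
CONCEPTS = ("Tumor", "Normal", "Mix")


def toy_model(present):
    """Stand-in for a MIL model scored on a bag restricted to the concepts in `present`.

    Assumption: the planted decision rule is "Tumor OR Mix".
    """
    return 0.9 if ("Tumor" in present or "Mix" in present) else 0.1


# Candidate rules (a small query space), each a Boolean function of the present concepts.
QUERIES = {
    "Tumor": lambda s: "Tumor" in s,
    "Mix": lambda s: "Mix" in s,
    "Tumor AND Mix": lambda s: "Tumor" in s and "Mix" in s,
    "Tumor OR Mix": lambda s: "Tumor" in s or "Mix" in s,
    "NOT Normal": lambda s: "Normal" not in s,
    "Normal OR Tumor": lambda s: "Normal" in s or "Tumor" in s,
}


def all_subsets(items):
    """Every subset of `items`, from the empty set to the full set."""
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either sequence is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def alignment_scores(model, queries, concepts):
    """Score each rule by how well its truth pattern tracks the model's subset responses."""
    subsets = all_subsets(concepts)
    responses = [model(s) for s in subsets]
    return {
        name: pearson(responses, [1.0 if rule(s) else 0.0 for s in subsets])
        for name, rule in queries.items()
    }


scores = alignment_scores(toy_model, QUERIES, CONCEPTS)
best_rule = max(scores, key=scores.get)
print(best_rule, round(scores[best_rule], 3))
```

In this toy setting the planted rule ranks first because its truth pattern is an exact affine function of the model's responses. A real MIL model would need an actual masking or subsetting procedure over patches, which this sketch does not attempt.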

If this is right

  • The framework recovers ground-truth logical rules on synthetic MIL data.
  • Best-aligned rules on tumor detection data expose heterogeneous decision patterns and hidden model errors.
  • On the TCGA-HNSCC HPV task, aligned rules refine patient survival stratification beyond HPV status alone.
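The survival claim in the last bullet is a comparison between two patient groups, and Figure 3 reports a log-rank p-value for the clinical HPV split. As a hedged illustration of that statistic only, the following plain-Python sketch computes a two-group log-rank test. The function name and the toy follow-up data are assumptions for illustration, not the paper's code or cohort.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test.

    times_*:  follow-up time per patient
    events_*: 1 if the event (death) was observed, 0 if censored
    Returns (chi-square statistic with 1 degree of freedom, p-value).
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]

    obs_minus_exp = 0.0  # observed minus expected events in group A
    variance = 0.0       # hypergeometric variance accumulated over event times

    for t in sorted({u for u, e, _ in data if e}):
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)  # at risk in A
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)  # at risk in B
        n = n_a + n_b
        d = sum(1 for u, e, _ in data if u == t and e)               # events at t
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)  # events in A at t
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical toy data: group A fails early, group B fails late, no censoring.
chi2, p = logrank_test([1, 2, 3, 4, 5], [1] * 5, [6, 7, 8, 9, 10], [1] * 5)
print(f"chi2={chi2:.2f}, p={p:.4f}")
```

Whether a rule-derived partition separates survival better than HPV status would then come down to comparing such statistics on the same cohort, ideally with a correction for the fact that the partition was chosen after looking at the data.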

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Clinicians could use the extracted rules to check whether a model relies on medically plausible feature combinations rather than artifacts.
  • The same alignment approach might be applied to other MIL tasks outside pathology to surface unexpected logical dependencies.
  • If alignment scores prove stable across model retrainings, they could serve as a monitoring signal for concept drift in deployed pathology models.

Load-bearing premise

That the degree of alignment between model predictions and candidate logical rules accurately reflects the model's actual internal decision process.

What would settle it

Failure of the method to recover the injected ground-truth logical rules when applied to the synthetic MIL datasets described in the paper.

Figures

Figures reproduced from arXiv: 2606.06224 by Andreas Mock, Julius Hense, Klaus-Robert Müller, Mina Jamshidi Idaji, Niklas Prenißl, Thomas Schnake, Yanqing Luo.

Figure 1
Figure 1: Overview of the Symb-xMIL pipeline. Patches from a WSI are assigned semantic labels and grouped into a feature set of concepts (1–2). Multi-order subsets and a query space of logical rules are constructed over these concepts (2). The MIL model is evaluated on each subset, and the resulting subset responses are compared against query-induced reference patterns (3–4). The resulting alignment scores form the …
Figure 2
Figure 2: PCA visualization of the Camelyon16 symbolic representation space. Each point represents one tumor slide and is colored by its best-query group; groups with fewer than three are labeled "Others". Markers indicate cluster centers, and black outlines denote macro-metastasis slides. See also Table S3 for more information about the clusters.
Figure 3
Figure 3: Symb-xMIL clustering identifies HPV-like sample groups with prognostic relevance on TCGA-HNSCC. (a) Clinical HPV status stratifies patient survival in Kaplan-Meier analysis (log-rank p = 0.06). (b) t-SNE projection of the symbolic representation shows clusters used to assign samples to HPV-like and HPV-unlike groups based on HPV+ enrichment. (c) The symbolic-space-derived HPV-like/unlike partition also strati…
Figure 2
Figure 2: For each cluster, we report the number of slides (N), counts of macro and micro metastasis cases, and the most frequent best-matching symbolic queries within the cluster.

  Cluster               N   Macro  Micro  Top Rules (count)
  C0: Tumor + Context   68  45     23     Mix ∨ Tumor (63), Mix ∨ (Normal ∧ Tumor) (1), Normal ∨ Tumor (1)
  C1: Context-driven    41  4      37     ¬Norm ∨ Mix (19), Mix (15), Normal ∧ Mix (1)
  C2: Tumor-dominant    26  25     1      ¬Norm…
Original abstract

Explanations of multiple instance learning (MIL) models are widely used for validation and discovery in digital histopathology. Existing methods primarily rely on heatmaps that highlight influential regions but do not explain how evidence from different tissue regions is combined to produce a prediction. This limits interpretability, especially when decisions depend on interactions between tissue features. We introduce Symbolic explainable MIL (Symb-xMIL), a post-hoc explanation framework that quantifies how a MIL model's behavior aligns with human-readable decision rules, expressed as logical relationships (e.g., AND, OR, NOT) between input features. These alignment scores reveal semantic patterns underlying the model's predictions. We evaluate Symb-xMIL on synthetic and real-world histopathology datasets. On synthetic MIL data, Symb-xMIL reliably recovers ground-truth logical rules. In a clinical tumor detection task, the best-aligned rules uncover heterogeneous decision patterns and expose hidden model errors. On an HPV-prediction task on TCGA-HNSCC, a cohort of head and neck cancer, our framework refines patient survival stratification beyond HPV status with potential clinical relevance. Overall, Symb-xMIL extends MIL explainability beyond visual attribution toward structured, rule-based reasoning, enabling more transparent and semantically grounded interpretation of model predictions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces Symb-xMIL, a post-hoc explanation framework for multiple instance learning (MIL) models in digital pathology. It generates candidate logical rules (AND/OR/NOT combinations of input features) and computes alignment scores with the MIL model's predictions to provide structured, rule-based interpretations beyond visual heatmaps. On synthetic MIL data the method recovers planted ground-truth rules; on a clinical tumor detection task the best-aligned rules reveal heterogeneous decision patterns and hidden model errors; on TCGA-HNSCC HPV-prediction data the framework refines patient survival stratification beyond HPV status alone.

Significance. If the alignment scores can be shown to reflect the model's internal decision logic rather than output correlation, the work would meaningfully extend MIL explainability in histopathology from visual attribution to human-readable logical reasoning. This could support more reliable model validation, error detection, and clinically relevant stratification. The synthetic recovery result is a clear strength; the real-data claims hinge on the untested fidelity assumption.

major comments (2)
  1. [Evaluation on real-world histopathology datasets] The central claim that Symb-xMIL yields 'semantically grounded interpretation of model predictions' requires that high alignment scores indicate the MIL model internally combines instance evidence according to the logical rule. No interventions, gradient comparisons, or ablations against the MIL aggregator are described to test this; the scores could reflect output correlation without internal fidelity. This assumption is load-bearing for all real-data claims.
  2. [Synthetic MIL data evaluation] The synthetic-data experiment recovers planted rules, but the manuscript provides no quantitative details on how the MIL aggregator is configured or whether the alignment metric is compared against alternative explanation methods (e.g., attention weights or SHAP). Without these controls it is unclear whether the recovery is specific to the proposed alignment procedure.
minor comments (1)
  1. [Method overview] The abstract and evaluation descriptions do not specify the exact feature set used to construct logical rules or the search procedure over rule space; these details are needed for reproducibility.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each of the major comments point by point below.

Point-by-point responses
  1. Referee: [Evaluation on real-world histopathology datasets] The central claim that Symb-xMIL yields 'semantically grounded interpretation of model predictions' requires that high alignment scores indicate the MIL model internally combines instance evidence according to the logical rule. No interventions, gradient comparisons, or ablations against the MIL aggregator are described to test this; the scores could reflect output correlation without internal fidelity. This assumption is load-bearing for all real-data claims.

    Authors: We agree that the manuscript does not include direct tests of internal fidelity such as interventions or ablations on the MIL aggregator. Symb-xMIL is a post-hoc method that quantifies alignment between logical rules and the model's output predictions. This measures how the model's behavior (i.e., its predictions) aligns with human-readable rules, which is valuable for interpretation even though it does not directly probe the aggregator's internal computations. However, we acknowledge that this leaves open the possibility that a rule correlates with the outputs without matching the internal computation. In the revised manuscript, we will clarify the scope of the alignment scores in the methods and discussion sections and add a limitations paragraph on this point. revision: partial

  2. Referee: [Synthetic MIL data evaluation] The synthetic-data experiment recovers planted rules, but the manuscript provides no quantitative details on how the MIL aggregator is configured or whether the alignment metric is compared against alternative explanation methods (e.g., attention weights or SHAP). Without these controls it is unclear whether the recovery is specific to the proposed alignment procedure.

    Authors: We appreciate this observation. The current manuscript describes the synthetic setup at a high level but indeed lacks specific quantitative details on the MIL aggregator (such as the type of pooling or attention mechanism used) and does not include comparisons to other explanation methods. We will revise the methods and results sections to include these details, specifying the aggregator configuration, and add a comparison against attention weights to demonstrate the added value of the symbolic alignment approach. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained

full rationale

The paper presents Symb-xMIL as a post-hoc framework that generates candidate logical rules over features and computes alignment scores with MIL model outputs. On synthetic data the method recovers planted rules via direct comparison, which is an external validation rather than a definitional reduction. Real-data claims rest on applying the same alignment computation to clinical datasets and observing patterns, without any fitted parameter being relabeled as a prediction or any self-citation serving as the sole justification for the core alignment metric. No equations or steps reduce the claimed outputs to the inputs by construction; the framework is independent of the target results it is applied to.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no information on free parameters, axioms, or invented entities.


