GRALIS: A Unified Canonical Framework for Linear Attribution Methods via Riesz Representation

Raimondo Fanale

arxiv: 2605.05480 · v2 · pith:677P27A3new · submitted 2026-05-06 · 💻 cs.LG · cs.AI· stat.ML

GRALIS: A Unified Canonical Framework for Linear Attribution Methods via Riesz Representation

Raimondo Fanale This is my paper

Pith reviewed 2026-05-20 23:09 UTC · model grok-4.3

classification 💻 cs.LG cs.AIstat.ML

keywords attribution methodsexplainable AIRiesz representationSHAPintegrated gradientslinear functionalscompleteness axiominteraction values

0 comments

The pith

Every additive linear continuous attribution functional on L2 admits a unique canonical representation via the Riesz theorem.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that any attribution method which adds up contributions, scales linearly with input changes, and behaves continuously on the space of square-integrable functions can be written in exactly one standard form consisting of a query set, a weight function, and a delta term. This form is forced by the Riesz representation theorem for linear functionals on Hilbert space. A reader would care because the same form recovers SHAP, integrated gradients, LIME, and linearized GradCAM while automatically supplying completeness, interaction values, and convergence rates that individual methods usually prove only in isolation. The framework therefore supplies a common language in which the guarantees of several popular techniques become theorems rather than separate claims.

Core claim

Every additive, linear, and continuous attribution functional on L^2(Q, mu) admits a unique canonical representation (Q, w, Delta), proved necessary by the Riesz Representation Theorem. This class includes SHAP, IG, LIME and linearized GradCAM. The representation supplies seven simultaneous guarantees: necessary canonical form, exact completeness, Monte Carlo convergence of order O(1/sqrt(m)) plus O(1/k), exact Shapley interaction values, Hoeffding ANOVA decomposition, Sobol sensitivity generalization, and a multi-scale extension with minimum-variance weights.

What carries the argument

The Riesz representation of a continuous linear functional on the Hilbert space L^2(Q, mu), which produces the unique triple (Q, w, Delta) that encodes any qualifying attribution method.

If this is right

All qualifying methods automatically satisfy completeness because the representation is exact.
Monte Carlo sampling of the representation converges at the stated rate without additional assumptions.
Shapley interaction values and Hoeffding ANOVA terms become direct corollaries rather than separate derivations.
The multi-scale extension yields minimum-variance weights while preserving the other six properties.
The same representation satisfies 13.5 out of 14 listed axiomatic properties at once.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

New attribution procedures could be built by choosing different (Q, w, Delta) triples that still obey the linearity and continuity conditions.
Nonlinear methods such as standard GradCAM might be analyzed by projecting them onto the nearest linear functional inside the same space.
The algebraic correspondence with Shapley values via the Möbius transform suggests similar reductions could exist for other cooperative-game concepts in machine learning.
Empirical faithfulness metrics on histology images could be re-derived directly from the canonical weights rather than from method-specific heuristics.

Load-bearing premise

The attribution methods being studied must be additive, linear, and continuous functionals on an L^2 function space.

What would settle it

An explicit construction of an attribution procedure that is additive, linear, and continuous yet cannot be expressed as any triple (Q, w, Delta) on the given L^2 space.

Figures

Figures reproduced from arXiv: 2605.05480 by Raimondo Fanale.

**Figure 1.** Figure 1: Conditioned integration paths in GRALIS ( view at source ↗

**Figure 2.** Figure 2: GRALIS-MC pseudocode. The random permutation samples the view at source ↗

**Figure 2.** Figure 2: GRALIS-MC pseudocode. The random permutation samples the [PITH_FULL_IMAGE:figures/full_fig_p013_2.png] view at source ↗

**Figure 3.** Figure 3: visualizes the reducibility hierarchy. GRALIS (canonical form) KernelSHAP Integrated Gradients LIME MS-GRALIS SHAP GradCAM πx ≡1, ∇F ≈ const. S =F \{i}, πx→δx lin. surrogate on F L levels uniform kernel 1 level cf. Tab. 5 6/14 5.5/14 view at source ↗

**Figure 3.** Figure 3: Reducibility hierarchy of GRALIS. Each arrow is a specialization (limit or special case). Scores refer to [PITH_FULL_IMAGE:figures/full_fig_p017_3.png] view at source ↗

read the original abstract

The main XAI attribution methods for deep neural networks -- GradCAM, SHAP, LIME, Integrated Gradients -- operate on separate theoretical foundations and are not formally comparable. We present GRALIS (Gradient-Riesz Averaged Locally-Integrated Shapley), a mathematical framework establishing a representation theory for attributions: every additive, linear, and continuous attribution functional on L^2(Q,mu) admits a unique canonical representation (Q, w, Delta), proved necessary by the Riesz Representation Theorem. This class encompasses SHAP, IG, LIME and linearized GradCAM, but excludes nonlinear functionals such as standard GradCAM or attention maps. Seven formal theorems provide simultaneous guarantees absent in any individual method: (T1) necessary canonical form; (T2) exact completeness; (T3) Monte Carlo convergence O(1/sqrt(m))+O(1/k); (T4) exact Shapley Interaction Values; (T5) Hoeffding ANOVA decomposition; (T6) Sobol sensitivity generalization; (T7) multi-scale extension (MS-GRALIS) with minimum-variance weights. An algebraic appendix justifies the GRALIS-SIV correspondence via the Mobius transform without circularity. GRALIS satisfies 13.5/14 axiomatic properties vs. 2.5-6/14 for individual methods, including completeness, sensitivity, locality, order-k interactions and optimal multi-scale aggregation simultaneously. Preliminary validation on BreaKHis (1,187 histology images, DenseNet-121) reports deletion faithfulness AUC +0.015 (malignant), 96% class-conditional consistency, SAL = 0.762+/-0.109 and sparsity index 0.39. Extended comparison with baseline XAI methods is planned for a companion paper.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

GRALIS applies Riesz to give a canonical form for linear attributions covering SHAP, IG, LIME and linearized GradCAM, but the global L2 linearity assumption looks shaky for how those methods are actually defined.

read the letter

The main thing here is that the paper uses the Riesz Representation Theorem to derive a necessary canonical representation for any additive linear continuous functional on L2(Q, mu), and shows that SHAP, Integrated Gradients, LIME, and a linearized GradCAM all fit inside that class while nonlinear ones like standard GradCAM do not. Seven theorems then supply joint guarantees on completeness, interaction values, convergence, ANOVA decomposition, Sobol indices, and multi-scale weighting that no single prior method had at once. The algebraic appendix linking to Shapley interaction values via the Möbius transform is a clean way to keep the correspondence non-circular if the details check out. That is the actual advance: a single representation theory that lets the methods be compared and extended under the same axioms, with a reported 13.5/14 axiomatic score versus much lower numbers for the baselines. The setup is formally grounded enough to be worth taking seriously on its own terms. The empirical section is thin, limited to preliminary numbers on one 1,187-image BreaKHis histology set with DenseNet-121, showing only a small faithfulness AUC lift and some consistency and sparsity metrics. That is not yet strong evidence for the broader claims. The bigger soft spot is the domain question raised in the stress-test note. Standard SHAP and IG constructions are local or path-specific for a fixed model and point; treating them as linear maps defined on the entire L2 space of arbitrary square-integrable functions requires an explicit extension that the abstract does not spell out. If the full proofs do not supply that construction, the direct appeal to RRT for uniqueness may not go through without additional restrictions on the function class. This work is aimed at XAI researchers who care about axiomatic unification and formal comparison rather than immediate practical tools. A reader looking for ways to handle interactions and multi-scale aggregation together under one set of theorems would find material worth reading. It has enough formal structure and new representation results to deserve a serious referee, even though the experiments will need expansion and the domain of the functionals will need careful verification.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces GRALIS as a unified canonical framework for linear attribution methods in XAI. It claims that every additive, linear, and continuous attribution functional on the Hilbert space L^2(Q, μ) admits a unique representation (Q, w, Δ) by the Riesz Representation Theorem. This is asserted to encompass SHAP, Integrated Gradients, LIME, and linearized GradCAM (but exclude nonlinear methods such as standard GradCAM). Seven theorems are stated that simultaneously guarantee canonical form, exact completeness, Monte Carlo convergence, exact Shapley interaction values via Möbius transform, Hoeffding ANOVA decomposition, Sobol sensitivity generalization, and a multi-scale extension (MS-GRALIS) with minimum-variance weights. The framework is reported to satisfy 13.5/14 axiomatic properties and is supported by preliminary numerical results on the BreaKHis dataset (1,187 images, DenseNet-121).

Significance. If the linearity assumption and the seven theorems can be rigorously established, the work would supply a common representation theory that permits formal comparison and principled extension of attribution techniques, addressing a recognized fragmentation in the XAI literature. The algebraic appendix justifying the GRALIS-SIV correspondence without circularity and the simultaneous satisfaction of completeness, sensitivity, locality, order-k interactions, and optimal multi-scale aggregation constitute genuine strengths. The preliminary empirical numbers (deletion AUC +0.015, 96% class-conditional consistency, SAL 0.762±0.109) are consistent with the theoretical claims but remain too limited to assess practical impact.

major comments (2)

[Abstract and introduction] Abstract and central claim: the invocation of the Riesz Representation Theorem to obtain a unique canonical form (Q, w, Δ) presupposes that the attribution functionals are well-defined, additive, linear, and continuous on the entire space L^2(Q, μ). Standard constructions of SHAP (coalition expectations), IG (path integrals), and LIME (local linear fits) are instance-specific and operate on perturbations or expectations around a fixed point x for a fixed predictor f. Extending these to arbitrary elements of L^2(Q, μ), including functions unrelated to any given model, is not automatic and requires an explicit construction or restriction of the domain; without it the direct application of RRT does not follow.
[Abstract and § on theorems] Theorems T1–T7 and validation: the abstract asserts seven formal theorems and a 13.5/14 axiomatic score, yet only preliminary numerical results on a single dataset (BreaKHis, 1,187 images) are supplied. Full proofs or at least detailed derivations for the convergence rate in T3, the multi-scale minimum-variance weights in T7, and the algebraic Möbius-transform justification in the appendix are needed to substantiate the simultaneous guarantees that are claimed to be absent from individual methods.

minor comments (2)

The canonical triple (Q, w, Δ) is introduced without an early, self-contained definition of each component; a short paragraph or diagram clarifying their roles would improve readability.
[Empirical evaluation] The reported deletion-faithfulness AUC improvement (+0.015) and sparsity index (0.39) are given without a side-by-side table against baseline methods; although a companion paper is planned, a minimal comparative table in the current manuscript would strengthen the empirical section.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments on our manuscript. We address each major comment point by point below, indicating where revisions will be made to clarify the framework and strengthen the presentation of the theorems.

read point-by-point responses

Referee: [Abstract and introduction] Abstract and central claim: the invocation of the Riesz Representation Theorem to obtain a unique canonical form (Q, w, Δ) presupposes that the attribution functionals are well-defined, additive, linear, and continuous on the entire space L^2(Q, μ). Standard constructions of SHAP (coalition expectations), IG (path integrals), and LIME (local linear fits) are instance-specific and operate on perturbations or expectations around a fixed point x for a fixed predictor f. Extending these to arbitrary elements of L^2(Q, μ), including functions unrelated to any given model, is not automatic and requires an explicit construction or restriction of the domain; without it the direct application of RRT does not follow.

Authors: We agree that the standard definitions of SHAP, Integrated Gradients, and LIME are formulated for specific instances around a fixed input x. In GRALIS, we treat attribution as a linear functional on the Hilbert space L^2(Q, μ) and show that each method arises from a particular choice of measure Q, weight function w, and difference operator Δ. To make the application of the Riesz Representation Theorem fully rigorous, we will add a dedicated subsection in the revised manuscript that explicitly constructs the extension of these instance-specific operators to continuous linear functionals on the full space (or a suitable dense subspace of model-relevant functions). This will include the necessary domain restrictions and verify linearity and continuity for each method. revision: yes
Referee: [Abstract and § on theorems] Theorems T1–T7 and validation: the abstract asserts seven formal theorems and a 13.5/14 axiomatic score, yet only preliminary numerical results on a single dataset (BreaKHis, 1,187 images) are supplied. Full proofs or at least detailed derivations for the convergence rate in T3, the multi-scale minimum-variance weights in T7, and the algebraic Möbius-transform justification in the appendix are needed to substantiate the simultaneous guarantees that are claimed to be absent from individual methods.

Authors: We acknowledge that the current manuscript presents the theorem statements and an algebraic appendix but does not contain complete, self-contained proofs for all claims. In the revision we will expand the appendix to include (i) the full derivation of the Monte Carlo convergence rate O(1/√m) + O(1/k) for T3, (ii) the explicit optimization yielding the minimum-variance weights for the multi-scale extension in T7, and (iii) a detailed, non-circular algebraic proof of the GRALIS–SIV correspondence via the Möbius transform. The preliminary numerical results on BreaKHis are intended only as an illustrative validation; as noted in the manuscript, a comprehensive empirical comparison is reserved for a companion paper. revision: yes

Circularity Check

0 steps flagged

No significant circularity: derivation applies standard Riesz theorem to assumed linear functionals

full rationale

The paper's core claim applies the Riesz Representation Theorem (an external, standard result in functional analysis) to the class of additive, linear, and continuous functionals on L^2(Q, mu), yielding a unique canonical representation (Q, w, Delta). This is not self-definitional or a fitted prediction, as the representation follows directly from the theorem once the linearity/continuity premises are granted. The abstract explicitly states that the GRALIS-SIV correspondence is justified algebraically via the Möbius transform in an appendix 'without circularity,' providing independent grounding. No load-bearing self-citations, uniqueness theorems imported from the authors' prior work, or ansatzes smuggled via citation are present in the provided text. The framework shows that SHAP, IG, LIME, and linearized GradCAM fit the linear class but does not reduce any theorem or prediction to a tautological fit or renaming of inputs. The derivation is therefore self-contained against external mathematical benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The framework rests on the standard Riesz Representation Theorem and introduces a new canonical representation; no free parameters or additional invented entities with independent evidence are described in the abstract.

axioms (1)

standard math Riesz Representation Theorem applies to additive, linear, continuous functionals on L^2(Q, mu)
Invoked to establish the necessary unique canonical representation (Q, w, Delta) for the class of attribution methods.

invented entities (1)

Canonical representation (Q, w, Delta) no independent evidence
purpose: To provide a unified form for linear attribution functionals
Defined as the unique output of the Riesz theorem applied to the attribution functional.

pith-pipeline@v0.9.0 · 5861 in / 1444 out tokens · 40819 ms · 2026-05-20T23:09:34.873780+00:00 · methodology

Review history (2 revisions) →

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

every additive, linear, and continuous attribution functional on L²(Q,μ) admits a unique canonical representation (Q, w, Δ), proved necessary by the Riesz Representation Theorem
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

This class encompasses SHAP, IG, LIME and linearized GradCAM

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

30 extracted references · 30 canonical work pages · 1 internal anchor

[1]

Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., & Batra, D. (2017). Grad-CAM: Visual explanations from deep networks via gradient-based localization. ICCV, 618–626

work page 2017
[2]

Lundberg, S.M., & Lee, S.-I. (2017). A unified approach to interpreting model predic- tions.NeurIPS 30

work page 2017
[3]

Why should I trust you?

Ribeiro, M.T., Singh, S., & Guestrin, C. (2016). “Why should I trust you?”: Explaining the predictions of any classifier.KDD, 1135–1144

work page 2016
[4]

Sundararajan, M., Taly, A., & Yan, Q. (2017). Axiomatic attribution for deep networks. ICML, 3319–3328

work page 2017
[5]

Ancona, M., Ceolini, E., Öztireli, C., & Gross, M. (2018). Towards better understanding of gradient-based attribution methods for deep neural networks.ICLR. 25

work page 2018
[6]

Montavon, G., Lapuschkin, S., Binder, A., Müller, K.-R., & Samek, W. (2017). Explain- ing nonlinear classification decisions with deep Taylor decomposition.Pattern Recogni- tion, 65, 211–222

work page 2017
[7]

Simonyan, K., Vedaldi, A., & Zisserman, A. (2013). Deep inside convolutional net- works: Visualising image classification models and saliency maps.arXiv preprint arXiv:1312.6034

work page internal anchor Pith review Pith/arXiv arXiv 2013
[8]

Chattopadhyay, A., Sarkar, A., Howlader, P., & Balasubramanian, V.N. (2018). Grad- CAM++: Generalized gradient-based visual explanations for deep convolutional net- works.WACV, 839–847

work page 2018
[9]

Covert, I., & Lee, S.-I. (2021). Improving KernelSHAP: Practical Shapley value estima- tion using linear regression.AISTATS

work page 2021
[10]

Lundstrom, D., Jain, T., & Koyejo, S. (2022). A rigorous study of integrated gradi- ents method and extensions to internal neuron attributions.Transactions on Machine Learning Research (TMLR)

work page 2022
[11]

Kindermans, P.-J., Hooker, S., Adebayo, J., Alber, M., Schütt, K.T., Dähne, S., Erhan, D., & Kim, B. (2019). The (un)reliability of saliency methods. InExplainability of AI, Springer LNCS, pp. 267–280

work page 2019
[12]

Hooker, S., Erhan, D., Kindermans, P.-J., & Kim, B. (2019). A benchmark for inter- pretability methods in deep neural networks.NeurIPS 32

work page 2019
[13]

Wang, H., Wang, Z., Du, M., Yang, F., Zhang, Z., Ding, S., Mardziel, P., & Hu, X. (2020). Score-CAM: Score-weighted visual explanations for convolutional neural net- works.CVPR Workshops

work page 2020
[14]

Fu, R., Hu, Q., Dong, X., Guo, Y., Gao, Y., & Li, B. (2020). Axiom-based Grad-CAM: Towards accurate visualization and explanation of CNNs.BMVC

work page 2020
[15]

Draelos, R.L., & Carin, L. (2021). Use HiResCAM instead of Grad-CAM for faithful explanations of convolutional neural networks.arXiv preprint arXiv:2011.08891

work page arXiv 2021
[16]

Petsiuk, V., Das, A., & Saenko, K. (2018). RISE: Randomized input sampling for ex- planation of black-box models.BMVC

work page 2018
[17]

Rong, Y., Leemann, T., Nguyen, T.-N., Zeitler, L., Jyothiprakash, P., Bhatt, U., Kasneci, E., & Kasneci, G. (2022). Evaluating the faithfulness of saliency-based explanations via the ROAD benchmark.arXiv preprint arXiv:2202.00449

work page arXiv 2022
[18]

Bhatt, U., Weller, A., &Moura, J.M.F.(2020).Evaluatingandaggregatingfeature-based model explanations.IJCAI, 3016–3022

work page 2020
[19]

Grabisch, M., & Roubens, M. (1999). An axiomatic approach to the concept of interac- tion among players in cooperative games.International Journal of Game Theory, 28(4), 547–565

work page 1999
[20]

Hoeffding, W. (1948). A class of statistics with asymptotically normal distribution.An- nals of Mathematical Statistics, 19(3), 293–325

work page 1948
[21]

Efron, B., & Stein, C. (1981). The jackknife estimate of variance.Annals of Statistics, 9(3), 586–596

work page 1981
[22]

Owen, A. B. (2014). Sobol’ indices and Shapley value.SIAM/ASA Journal on Uncer- tainty Quantification, 2(1), 245–251.doi:10.1137/130936233 26

work page doi:10.1137/130936233 2014
[23]

Sobol’, I.M. (1993). Sensitivity estimates for nonlinear mathematical models.Mathemat- ical Modelling and Computational Experiments, 1(4), 407–414

work page 1993
[24]

Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., & Süsstrunk, S. (2012). SLIC su- perpixels compared to state-of-the-art superpixel methods.IEEE TPAMI, 34(11), 2274– 2282

work page 2012
[25]

A., Oliveira, L

Spanhol, F. A., Oliveira, L. S., Petitjean, C., & Heutte, L. (2016). A dataset for breast cancer histological image classification.IEEE Transactions on Biomedical Engineering, 63(7), 1455–1462

work page 2016
[26]

Riesz, F. (1909). Sur les opérations fonctionnelles linéaires.Comptes Rendus de l’Académie des Sciences, 149, 974–977

work page 1909
[27]

(1991).Functional Analysis(2nd ed.)

Rudin, W. (1991).Functional Analysis(2nd ed.). McGraw-Hill. [Theorem 4.12: Riesz representation on Hilbert spaces.]

work page 1991
[28]

Fanale, R., Martini, B., Sciarrone, F., & Caldelli, R. (2026). Explainable ar- tificial intelligence for the analysis of histopathological images of breast cancer: Methods, interpretability and emerging directions.Frontiers in Signal Processing. doi:10.3389/frsip.2026.1795809

work page doi:10.3389/frsip.2026.1795809 2026
[29]

Fanale, R., Caldelli, R., Martini, B., & Sciarrone, F. (2026). A quantitative frame- work for explainability assessment in medical imaging.Frontiers in Imaging, 5, 1846414. doi:10.3389/fimag.2026.1846414Transparency note: this work shares authorship with the present paper; results involving ExpiScore should be interpreted with this in mind

work page doi:10.3389/fimag.2026.1846414transparency 2026
[30]

Fanale, R. (2026). GRALIS-Report: Auditable region-level attribution and structured clinical report generation for breast cancer histology. Manuscript submitted toFrontiers in Signal Processing. 27

work page 2026

[1] [1]

Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., & Batra, D. (2017). Grad-CAM: Visual explanations from deep networks via gradient-based localization. ICCV, 618–626

work page 2017

[2] [2]

Lundberg, S.M., & Lee, S.-I. (2017). A unified approach to interpreting model predic- tions.NeurIPS 30

work page 2017

[3] [3]

Why should I trust you?

Ribeiro, M.T., Singh, S., & Guestrin, C. (2016). “Why should I trust you?”: Explaining the predictions of any classifier.KDD, 1135–1144

work page 2016

[4] [4]

Sundararajan, M., Taly, A., & Yan, Q. (2017). Axiomatic attribution for deep networks. ICML, 3319–3328

work page 2017

[5] [5]

Ancona, M., Ceolini, E., Öztireli, C., & Gross, M. (2018). Towards better understanding of gradient-based attribution methods for deep neural networks.ICLR. 25

work page 2018

[6] [6]

Montavon, G., Lapuschkin, S., Binder, A., Müller, K.-R., & Samek, W. (2017). Explain- ing nonlinear classification decisions with deep Taylor decomposition.Pattern Recogni- tion, 65, 211–222

work page 2017

[7] [7]

Simonyan, K., Vedaldi, A., & Zisserman, A. (2013). Deep inside convolutional net- works: Visualising image classification models and saliency maps.arXiv preprint arXiv:1312.6034

work page internal anchor Pith review Pith/arXiv arXiv 2013

[8] [8]

Chattopadhyay, A., Sarkar, A., Howlader, P., & Balasubramanian, V.N. (2018). Grad- CAM++: Generalized gradient-based visual explanations for deep convolutional net- works.WACV, 839–847

work page 2018

[9] [9]

Covert, I., & Lee, S.-I. (2021). Improving KernelSHAP: Practical Shapley value estima- tion using linear regression.AISTATS

work page 2021

[10] [10]

Lundstrom, D., Jain, T., & Koyejo, S. (2022). A rigorous study of integrated gradi- ents method and extensions to internal neuron attributions.Transactions on Machine Learning Research (TMLR)

work page 2022

[11] [11]

Kindermans, P.-J., Hooker, S., Adebayo, J., Alber, M., Schütt, K.T., Dähne, S., Erhan, D., & Kim, B. (2019). The (un)reliability of saliency methods. InExplainability of AI, Springer LNCS, pp. 267–280

work page 2019

[12] [12]

Hooker, S., Erhan, D., Kindermans, P.-J., & Kim, B. (2019). A benchmark for inter- pretability methods in deep neural networks.NeurIPS 32

work page 2019

[13] [13]

Wang, H., Wang, Z., Du, M., Yang, F., Zhang, Z., Ding, S., Mardziel, P., & Hu, X. (2020). Score-CAM: Score-weighted visual explanations for convolutional neural net- works.CVPR Workshops

work page 2020

[14] [14]

Fu, R., Hu, Q., Dong, X., Guo, Y., Gao, Y., & Li, B. (2020). Axiom-based Grad-CAM: Towards accurate visualization and explanation of CNNs.BMVC

work page 2020

[15] [15]

Draelos, R.L., & Carin, L. (2021). Use HiResCAM instead of Grad-CAM for faithful explanations of convolutional neural networks.arXiv preprint arXiv:2011.08891

work page arXiv 2021

[16] [16]

Petsiuk, V., Das, A., & Saenko, K. (2018). RISE: Randomized input sampling for ex- planation of black-box models.BMVC

work page 2018

[17] [17]

Rong, Y., Leemann, T., Nguyen, T.-N., Zeitler, L., Jyothiprakash, P., Bhatt, U., Kasneci, E., & Kasneci, G. (2022). Evaluating the faithfulness of saliency-based explanations via the ROAD benchmark.arXiv preprint arXiv:2202.00449

work page arXiv 2022

[18] [18]

Bhatt, U., Weller, A., &Moura, J.M.F.(2020).Evaluatingandaggregatingfeature-based model explanations.IJCAI, 3016–3022

work page 2020

[19] [19]

Grabisch, M., & Roubens, M. (1999). An axiomatic approach to the concept of interac- tion among players in cooperative games.International Journal of Game Theory, 28(4), 547–565

work page 1999

[20] [20]

Hoeffding, W. (1948). A class of statistics with asymptotically normal distribution.An- nals of Mathematical Statistics, 19(3), 293–325

work page 1948

[21] [21]

Efron, B., & Stein, C. (1981). The jackknife estimate of variance.Annals of Statistics, 9(3), 586–596

work page 1981

[22] [22]

Owen, A. B. (2014). Sobol’ indices and Shapley value.SIAM/ASA Journal on Uncer- tainty Quantification, 2(1), 245–251.doi:10.1137/130936233 26

work page doi:10.1137/130936233 2014

[23] [23]

Sobol’, I.M. (1993). Sensitivity estimates for nonlinear mathematical models.Mathemat- ical Modelling and Computational Experiments, 1(4), 407–414

work page 1993

[24] [24]

Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., & Süsstrunk, S. (2012). SLIC su- perpixels compared to state-of-the-art superpixel methods.IEEE TPAMI, 34(11), 2274– 2282

work page 2012

[25] [25]

A., Oliveira, L

Spanhol, F. A., Oliveira, L. S., Petitjean, C., & Heutte, L. (2016). A dataset for breast cancer histological image classification.IEEE Transactions on Biomedical Engineering, 63(7), 1455–1462

work page 2016

[26] [26]

Riesz, F. (1909). Sur les opérations fonctionnelles linéaires.Comptes Rendus de l’Académie des Sciences, 149, 974–977

work page 1909

[27] [27]

(1991).Functional Analysis(2nd ed.)

Rudin, W. (1991).Functional Analysis(2nd ed.). McGraw-Hill. [Theorem 4.12: Riesz representation on Hilbert spaces.]

work page 1991

[28] [28]

Fanale, R., Martini, B., Sciarrone, F., & Caldelli, R. (2026). Explainable ar- tificial intelligence for the analysis of histopathological images of breast cancer: Methods, interpretability and emerging directions.Frontiers in Signal Processing. doi:10.3389/frsip.2026.1795809

work page doi:10.3389/frsip.2026.1795809 2026

[29] [29]

Fanale, R., Caldelli, R., Martini, B., & Sciarrone, F. (2026). A quantitative frame- work for explainability assessment in medical imaging.Frontiers in Imaging, 5, 1846414. doi:10.3389/fimag.2026.1846414Transparency note: this work shares authorship with the present paper; results involving ExpiScore should be interpreted with this in mind

work page doi:10.3389/fimag.2026.1846414transparency 2026

[30] [30]

Fanale, R. (2026). GRALIS-Report: Auditable region-level attribution and structured clinical report generation for breast cancer histology. Manuscript submitted toFrontiers in Signal Processing. 27

work page 2026