Rethinking XAI Evaluation: A Human-Centered Audit of Shapley Benchmarks in High-Stakes Settings
Pith reviewed 2026-05-08 12:16 UTC · model grok-4.3
The pith
Standard quantitative metrics for Shapley explanations do not predict human clarity or decision quality in high-stakes settings.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
A unified amortized framework isolates semantic differences among eight Shapley variants; across four risk datasets and 3,735 case reviews by professional analysts, quantitative proxies such as sparsity and faithfulness prove decoupled from perceived clarity and decision utility, while explanations raise analyst confidence without improving objective performance, creating a measurable risk of automation bias.
What carries the argument
The unified amortized framework that isolates semantic differences between eight Shapley variants while respecting low-latency operational constraints.
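A minimal sketch of what such an amortized setup could look like, assuming a PyTorch-style pipeline: an explainer network is trained offline to regress toward noisy Monte Carlo Shapley estimates, so that deployment-time attribution is a single low-latency forward pass. The value function, network shape, and training loop below are illustrative assumptions, not the authors' implementation; what distinguishes the Shapley variants in this framing is the choice of value function (e.g., the reference distribution used for removed features).

```python
# Illustrative amortized Shapley estimator (hypothetical; not the paper's code).
# explainer(x) is trained to regress toward noisy one-permutation Monte Carlo
# Shapley estimates, so deployment-time attribution is one forward pass.
import torch
import torch.nn as nn

def value_fn(model, x, mask, baseline):
    """v(S): scalar model score with features outside S replaced by a fixed baseline.
    Different Shapley variants correspond to different choices made here
    (e.g., marginal vs. conditional reference distributions)."""
    return model(torch.where(mask.bool(), x, baseline))

@torch.no_grad()
def one_permutation_shapley(model, x, baseline):
    """Unbiased single-sample Shapley estimate from one random feature ordering."""
    d = x.shape[-1]
    phi = torch.zeros(d)
    mask = torch.zeros(d)
    prev = value_fn(model, x, mask, baseline)
    for j in torch.randperm(d):
        mask[j] = 1.0
        curr = value_fn(model, x, mask, baseline)
        phi[j] = curr - prev
        prev = curr
    return phi

def train_amortized_explainer(model, X, baseline, epochs=10, lr=1e-3):
    """Fit explainer(x) -> phi by regression onto the noisy Monte Carlo targets."""
    d = X.shape[1]
    explainer = nn.Sequential(nn.Linear(d, 128), nn.ReLU(), nn.Linear(128, d))
    opt = torch.optim.Adam(explainer.parameters(), lr=lr)
    for _ in range(epochs):
        for x in X:
            target = one_permutation_shapley(model, x, baseline)  # noisy but unbiased
            loss = ((explainer(x) - target) ** 2).mean()
            opt.zero_grad(); loss.backward(); opt.step()
    return explainer  # inference: phi_hat = explainer(x), within latency budgets
```

In this framing the variants share the training loop and differ only in `value_fn`, which is what would let a single framework isolate their semantic differences under a shared latency budget.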
If this is right
- Current quantitative proxies are insufficient to predict the downstream human impact of an explanation method.
- Explanations can increase decision confidence without raising objective accuracy, creating a documented risk of automation bias.
- Selection of Shapley formulations and evaluation metrics for operational systems requires evidence tied to human outcomes rather than proxy scores alone.
Where Pith is reading between the lines
- New evaluation protocols may need direct measures of decision utility collected from domain experts rather than post-hoc proxy calculations.
- Similar human audits in medicine or autonomous systems could uncover parallel patterns of overreliance.
- Interface designs that surface uncertainty information alongside explanations might reduce the observed confidence inflation.
Load-bearing premise
The 3,735 analyst case reviews provide a representative proxy for real high-stakes operational risk workflows without major selection or contextual biases.
What would settle it
A controlled follow-up deployment in which analysts given Shapley explanations show measurably higher decision accuracy than a no-explanation control group on the same live risk cases.
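If such a deployment were run, the headline comparison could be as simple as a one-sided test on per-case decision correctness between the explanation and control arms. The sketch below (Python, statsmodels) uses made-up counts and ignores the clustering of repeated cases within analysts, which a mixed-effects model would handle more faithfully.

```python
# Hypothetical analysis sketch for the proposed follow-up deployment: compare objective
# decision accuracy between an explanation arm and a no-explanation control arm on the
# same live case pool. Counts below are placeholders.
import numpy as np
from statsmodels.stats.proportion import proportions_ztest, proportion_confint

def compare_arms(correct_expl, n_expl, correct_ctrl, n_ctrl, alpha=0.05):
    """One-sided test of whether the explanation arm has higher decision accuracy."""
    counts = np.array([correct_expl, correct_ctrl])
    nobs = np.array([n_expl, n_ctrl])
    z, p_one_sided = proportions_ztest(counts, nobs, alternative="larger")
    return {
        "acc_explained": correct_expl / n_expl,
        "ci_explained": proportion_confint(correct_expl, n_expl, alpha=alpha),
        "acc_control": correct_ctrl / n_ctrl,
        "ci_control": proportion_confint(correct_ctrl, n_ctrl, alpha=alpha),
        "z": z,
        "p_one_sided": p_one_sided,
    }

# Placeholder counts, only to illustrate the comparison being proposed.
print(compare_arms(correct_expl=1900, n_expl=2400, correct_ctrl=1520, n_ctrl=2000))
```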
Original abstract
Shapley values are a cornerstone of explainable AI, yet their proliferation into competing formulations has created a fragmented landscape with little consensus on practical deployment. While theoretical differences are well-documented, evaluation remains reliant on quantitative proxies whose alignment with human utility is unverified. In this work, we use a unified amortized framework to isolate semantic differences between eight Shapley variants under the low-latency constraints of operational risk workflows. We conduct a large-scale empirical evaluation across four risk datasets and a realistic fraud-detection environment involving professional analysts and 3,735 case reviews. Our results reveal a fundamental misalignment: standard quantitative metrics, such as sparsity and faithfulness, are decoupled from human-perceived clarity and decision utility. Furthermore, while no formulation improved objective analyst performance, explanations consistently increased decision confidence, signaling a critical risk of automation bias in high-stakes settings. These findings suggest that current evaluation proxies are insufficient for predicting downstream human impact, and we provide evidence-based guidance for selecting formulations and metrics in operational decision systems.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that standard quantitative metrics for evaluating Shapley value explanations (e.g., sparsity and faithfulness) are decoupled from human-perceived clarity and decision utility in high-stakes settings. Using a unified amortized framework to compare eight Shapley variants, the authors conduct a large empirical study across four risk datasets and a fraud-detection environment with professional analysts performing 3,735 case reviews. Key findings are that no formulation improved objective analyst performance but all increased decision confidence (indicating automation bias risk), and that current proxy metrics are insufficient for predicting downstream human impact.
Significance. If the central empirical results hold, the work is significant for XAI research because it supplies direct human-subject evidence that challenges reliance on proxy-based benchmarks for Shapley methods. The scale of the study (professional analysts, realistic operational workflow, multiple datasets) and the explicit demonstration of misalignment between quantitative proxies and human utility provide actionable guidance for deployment in risk-sensitive domains. The reproducible human-study design and falsifiable claims about automation bias are particular strengths.
minor comments (3)
- The unified amortized framework is central to isolating semantic differences among the eight variants; a concise table or diagram explicitly mapping each variant's key modeling choices (e.g., reference distribution, coalition sampling) to the observed human outcomes would improve readability.
- Section describing the 3,735 case reviews: the manuscript should report effect sizes and confidence intervals alongside p-values for the confidence-increase finding to allow readers to judge the practical magnitude of the automation-bias signal.
- The discussion of metric decoupling would benefit from an explicit statement of the correlation coefficients (or lack thereof) between each quantitative proxy and the human clarity/utility scores, rather than qualitative description alone; a minimal computational sketch of this and the preceding point follows this list.
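As a concrete illustration of the statistics requested in the two comments above, the sketch below computes a standardized effect size with a bootstrap interval for the confidence increase, and a Spearman rank correlation between one quantitative proxy and human ratings. All inputs are placeholders, not the paper's data.

```python
# Illustrative statistics for the two comments above (placeholder data, not the paper's):
# a standardized effect size with a bootstrap CI for the confidence increase, and a
# Spearman rank correlation between one quantitative proxy and human ratings.
import numpy as np
from scipy import stats

def cohens_d(conf_with_expl, conf_without_expl):
    """Standardized mean difference in analyst confidence (pooled SD)."""
    nx, ny = len(conf_with_expl), len(conf_without_expl)
    pooled_sd = np.sqrt(((nx - 1) * np.var(conf_with_expl, ddof=1) +
                         (ny - 1) * np.var(conf_without_expl, ddof=1)) / (nx + ny - 2))
    return (np.mean(conf_with_expl) - np.mean(conf_without_expl)) / pooled_sd

def bootstrap_ci(conf_with_expl, conf_without_expl, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the effect size."""
    rng = np.random.default_rng(seed)
    draws = [cohens_d(rng.choice(conf_with_expl, len(conf_with_expl), replace=True),
                      rng.choice(conf_without_expl, len(conf_without_expl), replace=True))
             for _ in range(n_boot)]
    return np.quantile(draws, [alpha / 2, 1 - alpha / 2])

def proxy_alignment(proxy_scores, human_scores):
    """Rank correlation between a quantitative proxy (e.g., sparsity) and human ratings."""
    rho, p_value = stats.spearmanr(proxy_scores, human_scores)
    return rho, p_value
```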
Simulated Author's Rebuttal
We thank the referee for their positive assessment of the work, recognition of its significance for XAI evaluation, and recommendation for minor revision. The emphasis on the scale of the human-subject study with professional analysts and the demonstration of misalignment between proxy metrics and human utility is appreciated. No major comments were enumerated in the report.
Circularity Check
No significant circularity in empirical human-subject study
full rationale
This is a purely empirical comparative study that evaluates eight Shapley variants via a unified amortized framework and a controlled experiment with 3,735 professional analyst reviews across four datasets. There is no mathematical derivation chain, set of fitted-parameter predictions, or self-referential equation that could reduce outputs to inputs by construction. The central claims (misalignment between quantitative proxies and human utility, plus the automation-bias signal) rest directly on observed human responses and statistical comparisons, which are externally falsifiable. The framework is presented as a methodological tool with explicit design choices rather than a self-defining ansatz or uniqueness theorem. No load-bearing self-citations or renamings of known results appear in the abstract or the described structure. The work's conclusions therefore stand or fall on evidence external to its own formalism rather than being circular by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: standard assumptions of human-subject experiments, such as representative sampling and unbiased analyst responses.