pith. machine review for the scientific record.

arxiv: 2602.24176 · v4 · submitted 2026-02-27 · 💻 cs.CY

Recognition: no theorem link

Beyond Explainable AI (XAI): An Overdue Paradigm Shift and Post-XAI Research Directions

Authors on Pith · no claims yet

Pith reviewed 2026-05-15 18:47 UTC · model grok-4.3

classification 💻 cs.CY
keywords Explainable AI · Post-XAI · Paradigm shift · Deep neural networks · Large language models · AI verification · Interpretability · Certified AI

The pith

XAI contains deep paradoxes and false assumptions that make incremental fixes counterproductive, requiring a full shift to certified AI approaches.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines limitations in explainable AI for deep neural networks and large language models, tracing symptoms to two paradoxes, two conceptual confusions, and five false assumptions. From these root causes it claims that XAI is experimentally flawed and conceptually inconsistent, and that attempts to repair it only deepen the problems. A sympathetic reader would care because current explainability efforts risk producing misleading or confusing outputs instead of reliable understanding. The authors therefore call for a four-pronged replacement: verification-focused interactive AI, AI epistemology, user-sensible AI, and model-centered interpretability. This reorients the field from post-hoc explanations toward scientific certification and context-aware design.

Core claim

The central claim is that current XAI approaches for DNNs and LLMs exhibit significant empirical flaws, rest on conceptual paradoxes, and resist repair: further reform efforts would only deepen the confusion. The field must therefore undertake a four-pronged paradigm shift, to verification-focused Interactive AI for community certification protocols, AI Epistemology for rigorous foundations, User-Sensible AI for context-aware tailoring, and Model-Centered Interpretability for faithful technical analysis, together enabling reliable and certified AI development.

What carries the argument

The four-pronged paradigm shift that replaces post-hoc explanation with verification protocols, epistemic foundations, community-specific design, and direct model analysis.

If this is right

  • AI performance certification would shift from post-hoc explanations to community-established verification protocols (a hypothetical code sketch follows this list).
  • Research would prioritize building scientific foundations through AI epistemology rather than ad-hoc interpretability techniques.
  • Systems would be designed as user-sensible from the start, adapting to the needs of specific user communities.
  • Technical analysis would center on the models themselves for faithful description instead of generating separate explanations.
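
A hypothetical sketch of the first implication: a community verification protocol rendered as code. The task structure, metric, and thresholds here are editorial illustrations, not constructs from the paper.

```python
# Hypothetical sketch: certification as pre-registered pass/fail tasks,
# with no post-hoc explanation step anywhere in the loop.
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Tuple


@dataclass(frozen=True)
class CertificationTask:
    """One pre-registered test agreed on by a scientific community."""
    name: str
    metric: Callable[[Sequence[int], Sequence[int]], float]  # (y_true, y_pred) -> score
    threshold: float  # minimum passing score, fixed before any model is seen


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def certify(predict: Callable,
            tasks: Sequence[CertificationTask],
            datasets: Dict[str, Tuple[Sequence, Sequence[int]]]) -> Tuple[bool, dict]:
    """Run every pre-registered task; the verdict is all-or-nothing."""
    report = {}
    for task in tasks:
        X, y_true = datasets[task.name]
        score = task.metric(y_true, predict(X))
        report[task.name] = (score, score >= task.threshold)
    return all(ok for _, ok in report.values()), report
```

The design point, on this reading, is that the verdict comes from thresholds the community fixed before seeing the model, not from anything the system says about itself.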

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Regulatory frameworks might move toward mandatory verification benchmarks rather than explainability requirements.
  • User studies in human-AI interaction could test whether the new directions reduce documented confusion compared with current XAI outputs.
  • Connections to philosophy of science become relevant for validating what counts as certified knowledge in AI systems.

Load-bearing premise

The assumption that the identified paradoxes and false assumptions cannot be resolved by improving existing XAI methods and instead require abandoning the explainability paradigm altogether.

What would settle it

A concrete demonstration that an existing XAI technique can be adjusted to eliminate the stated paradoxes and confusions while still delivering consistent, non-misleading explanations across multiple user studies and model types.
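
As an editorial illustration of the consistency half of that test, the sketch below applies one attribution method to two model types and scores rank agreement. Permutation importance stands in for whatever adjusted XAI technique is under test; the models, data, and agreement criterion are assumptions, not the paper's.

```python
# Sketch: does one attribution method rank features consistently across
# model types? A settling demonstration would need this (plus user studies)
# to hold across many datasets and methods, not a single toy run.
from scipy.stats import spearmanr
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=8, random_state=0)

rankings = []
for model in (LogisticRegression(max_iter=1000), GradientBoostingClassifier()):
    model.fit(X, y)
    # Permutation importance: one simple, model-agnostic attribution.
    result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
    rankings.append(result.importances_mean)

rho, _ = spearmanr(rankings[0], rankings[1])
print(f"cross-model attribution agreement (Spearman rho): {rho:.2f}")
```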

read the original abstract

This study provides a cross-disciplinary examination of Explainable Artificial Intelligence (XAI) approaches, focusing on deep neural networks (DNNs) and large language models (LLMs), and identifies empirical and conceptual limitations in current XAI. We discuss critical symptoms that stem from deeper root causes (i.e., two paradoxes, two conceptual confusions, and five false assumptions). These fundamental problems within the current XAI research field reveal three insights: experimentally, XAI exhibits significant flaws; conceptually, it is paradoxical; and pragmatically, further attempts to reform the paradoxical XAI might exacerbate its confusion, demanding fundamental shifts and new research directions. To move beyond XAI's limitations, we propose a four-pronged synthesized paradigm shift toward reliable and certified AI development. These four components include: verification-focused Interactive AI (IAI) to establish scientific community protocols for certifying AI system performance rather than attempting post-hoc explanations, AI Epistemology for rigorous scientific foundations, User-Sensible AI to create context-aware systems tailored to specific user communities, and Model-Centered Interpretability for faithful technical analysis, together offering comprehensive post-XAI research directions.

Editorial analysis

A structured set of objections, weighed in public.

A referee report, a simulated author's rebuttal, a circularity audit, and an axiom and free-parameter ledger. Tearing a paper down is the easy half of reading it: the pith above is the substance, and this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript provides a cross-disciplinary examination of XAI approaches for DNNs and LLMs, identifies empirical and conceptual limitations stemming from two paradoxes, two confusions, and five false assumptions, derives three insights (experimental flaws, conceptual paradox, and risk that reform exacerbates confusion), and proposes a four-pronged paradigm shift to post-XAI directions: verification-focused Interactive AI (IAI), AI Epistemology, User-Sensible AI, and Model-Centered Interpretability.

Significance. If the root-cause diagnosis holds and the necessity of abandoning incremental XAI improvements is established, the work could redirect research toward certified, context-aware, and epistemologically grounded AI systems. The proposal of concrete post-XAI components such as IAI and User-Sensible AI offers a structured research agenda, but the manuscript's support remains limited to conceptual assertion without empirical studies, formal derivations, or case analyses demonstrating that targeted fixes must fail.

major comments (2)
  1. [Abstract and section on the three insights] The pragmatic insight that further reform of XAI would exacerbate confusion (stated in the abstract and the section deriving the three insights) is asserted without a concrete demonstration. No argument shows that any incremental change addressing one of the listed paradoxes (e.g., the explanation paradox) necessarily leads to contradiction or performance regression within existing XAI frameworks.
  2. [Section identifying root causes (two paradoxes, two confusions, five false assumptions)] The inference that the two paradoxes, two confusions, and five false assumptions are irresolvable incrementally and therefore require a complete four-pronged shift is not derived. The root-causes section lists these issues but provides no formal conditions or counter-example showing that any fix satisfying those conditions must fail, leaving the necessity claim unsupported.
minor comments (2)
  1. [Proposal of the four-pronged paradigm shift] The newly introduced terms 'verification-focused Interactive AI (IAI)' and 'User-Sensible AI' are presented as distinct components but lack precise operational definitions or explicit contrasts with prior interactive or user-centered AI literature.
  2. [Discussion of experimental flaws] The claim that XAI 'exhibits significant flaws' experimentally would be strengthened by citing specific evaluation studies or benchmarks rather than remaining at the level of general assertion.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and detailed report. The comments highlight opportunities to strengthen the logical derivations in our conceptual analysis. Below we respond point by point to the major comments, clarifying how the manuscript derives its claims from the identified root causes while indicating targeted revisions that will make the necessity argument more explicit without altering the paper's cross-disciplinary scope.

read point-by-point responses
  1. Referee: [Abstract and section on the three insights] The pragmatic insight that further reform of XAI would exacerbate confusion (stated in the abstract and the section deriving the three insights) is asserted without a concrete demonstration. No argument shows that any incremental change addressing one of the listed paradoxes (e.g., the explanation paradox) necessarily leads to contradiction or performance regression within existing XAI frameworks.

    Authors: The pragmatic insight follows directly from the explanation paradox as defined in the root-causes section: any post-hoc method that increases local fidelity necessarily reduces human-comprehensible structure because DNN/LLM decision boundaries are high-dimensional and non-linear. The manuscript supports this by tracing how successive XAI refinements (e.g., from LIME to SHAP to attention visualization) have each traded one desideratum for another, producing the documented increase in contradictory claims across the literature. While the current text presents this as a logical consequence rather than a formal proof, we agree that an explicit chain of implications would strengthen the claim. We will therefore insert a short subsection that enumerates three representative incremental proposals, shows the specific contradiction each creates with one of the five false assumptions, and notes the resulting performance or interpretability regression reported in the cited empirical studies (an editorial sketch of this fidelity-comprehensibility tradeoff appears after these responses). revision: yes

  2. Referee: [Section identifying root causes (two paradoxes, two confusions, five false assumptions)] The inference that the two paradoxes, two confusions, and five false assumptions are irresolvable incrementally and therefore require a complete four-pronged shift is not derived. The root-causes section lists these issues but provides no formal conditions or counter-example showing that any fix satisfying those conditions must fail, leaving the necessity claim unsupported.

    Authors: The necessity claim is derived by showing that each root cause violates an invariant property of DNNs and LLMs (opacity, lack of causal semantics, and absence of a shared scientific ontology). Because these invariants are preserved under any post-hoc or architectural patch that leaves the model class unchanged, no incremental fix can simultaneously satisfy all five false assumptions without reintroducing at least one paradox. The manuscript illustrates this through the two confusions (equating correlation with explanation, and treating user mental models as model-agnostic). To make the derivation more transparent, we will add a compact table that maps each false assumption to the invariant it contradicts and to the paradox it re-creates, thereby supplying the explicit conditions the referee requests. revision: yes
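
The fidelity-comprehensibility tradeoff invoked in the first response can be rendered as a toy experiment. The sketch below is editorial, not the authors': a LIME-style sparse linear surrogate is fit to a black-box model around one input, and its local fidelity (R² on perturbed neighbors) is tracked as it is allowed more features. The black box, dataset, and neighborhood scale are illustrative assumptions.

```python
# Sketch: sparser (more comprehensible) local surrogates buy less fidelity.
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lars

X, y = make_friedman1(n_samples=400, n_features=10, random_state=0)
black_box = RandomForestRegressor(random_state=0).fit(X, y)

rng = np.random.default_rng(0)
x0 = X[0]
# Perturb around x0 and query the black box, as LIME-style methods do.
Z = x0 + 0.3 * rng.standard_normal((500, X.shape[1]))
fz = black_box.predict(Z)

for k in (1, 3, 5, 10):
    # Lars with n_nonzero_coefs=k yields a k-feature sparse linear surrogate.
    surrogate = Lars(n_nonzero_coefs=k).fit(Z, fz)
    print(f"{k:2d}-feature surrogate: local fidelity R^2 = {surrogate.score(Z, fz):.2f}")
```

If fidelity rises monotonically with the feature count, the run is consistent with the asserted tradeoff; if the one-feature surrogate already fits well, that is a small counterexample of exactly the kind the referee requests.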

Circularity Check

0 steps flagged

No circularity: conceptual critique relies on external literature analysis

full rationale

The paper conducts a cross-disciplinary literature review to identify symptoms in XAI (flaws, paradoxes, confusions, false assumptions) and argues these necessitate a four-pronged paradigm shift. No mathematical derivations, equations, fitted parameters, or self-referential definitions appear in the provided text. The central inference to post-XAI directions follows from logical examination of existing practices rather than any reduction of outputs to inputs by construction. Self-citations, if present among the large author list, are not shown to be load-bearing for the root-cause diagnosis or the shift proposal. The argument is self-contained against external benchmarks in the XAI literature and does not exhibit any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The central claim rests on the premise that XAI limitations are structural and require a full paradigm shift, drawing from cross-disciplinary conceptual analysis without new empirical data or formal proofs.

axioms (2)
  • domain assumption Post-hoc explanations cannot provide certification of AI system performance.
    Invoked to justify the shift to verification-focused IAI.
  • domain assumption AI requires rigorous scientific foundations analogous to established sciences.
    Basis for the AI Epistemology component.
invented entities (2)
  • verification-focused Interactive AI (IAI) no independent evidence
    purpose: Establish scientific community protocols for certifying AI system performance rather than post-hoc explanations.
    One of the four proposed research directions introduced as alternative to current XAI.
  • User-Sensible AI no independent evidence
    purpose: Create context-aware systems tailored to specific user communities.
    One of the four proposed research directions.

pith-pipeline@v0.9.0 · 5757 in / 1623 out tokens · 59468 ms · 2026-05-15T18:47:47.279359+00:00 · methodology

discussion (0)


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Knee-xRAI: An Explainable AI Framework for Automatic Kellgren-Lawrence Grading of Knee Osteoarthritis

    cs.CV · 2026-04 · unverdicted · novelty 6.0

    Knee-xRAI independently quantifies JSN, osteophytes, and sclerosis, then fuses them into auditable classifiers reaching a test QWK of 0.8436 on 8,260 radiographs.

Reference graph

Works this paper leans on

271 extracted references · 271 canonical work pages · cited by 1 Pith paper · 15 internal anchors
