"Why Should You Trust My Explanation?" Understanding Uncertainty in LIME Explanations
read the original abstract
Methods for interpreting machine learning black-box models increase the outcomes' transparency and in turn generates insight into the reliability and fairness of the algorithms. However, the interpretations themselves could contain significant uncertainty that undermines the trust in the outcomes and raises concern about the model's reliability. Focusing on the method "Local Interpretable Model-agnostic Explanations" (LIME), we demonstrate the presence of two sources of uncertainty, namely the randomness in its sampling procedure and the variation of interpretation quality across different input data points. Such uncertainty is present even in models with high training and test accuracy. We apply LIME to synthetic data and two public data sets, text classification in 20 Newsgroup and recidivism risk-scoring in COMPAS, to support our argument.
This paper has not been read by Pith yet.
Forward citations
Cited by 2 Pith papers
-
Do Fair Models Reason Fairly? Counterfactual Explanation Consistency for Procedural Fairness in Credit Decisions
Outcome-fair credit models often exhibit hidden procedural bias through inconsistent reasoning across groups, which the CEC framework mitigates by enforcing consistent feature attributions via counterfactuals.
-
Fairness of Explanations in Artificial Intelligence (AI): A Unifying Framework, Axioms, and Future Direction toward Responsible AI
A conditional invariance framework defines explanation fairness as explanations being statistically independent of protected attributes given task-relevant features, unifying existing metrics and enabling procedural b...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.