Pith · machine review for the scientific record

arxiv: 2605.15154 · v1 · submitted 2026-05-14 · 📊 stat.ML · cs.LG

Recognition: 2 theorem links · Lean Theorem

RoSHAP: A Distributional Framework and Robust Metric for Stable Feature Attribution

Authors on Pith: no claims yet

Pith reviewed 2026-05-15 02:50 UTC · model grok-4.3

classification 📊 stat.ML cs.LG
keywords feature attribution · SHAP · robust metric · distributional framework · bootstrap · stability · machine learning interpretability · feature ranking

The pith

RoSHAP summarizes the distribution of SHAP values to rank features by their activity, strength, and stability simultaneously.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a distributional framework to handle the stochastic variation in feature attribution methods such as SHAP, whose outputs can change with different data splits or random seeds. It models this variation using bootstrap resampling and kernel density estimation. Under mild regularity conditions, the aggregated scores are shown to be asymptotically Gaussian, which simplifies estimation. RoSHAP then condenses this distribution into a single metric for robust feature ranking. Experiments show it identifies important features more reliably and allows models to use fewer features without losing predictive power.
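
The bootstrap-plus-aggregation step described above can be sketched in a few lines. This is an illustrative stand-in, not the authors' code: the random draws below play the role of per-instance SHAP values for one feature, whereas the paper's actual pipeline would refit the model and recompute attributions on each resample.

```python
import random
import statistics

random.seed(0)

# Stand-in attribution scores for one feature (hypothetical values;
# the real pipeline recomputes SHAP on each refitted model).
scores = [random.gauss(0.8, 0.3) for _ in range(500)]

def bootstrap_aggregate(scores, n_boot=1000):
    """Bootstrap distribution of the aggregated (mean) attribution score."""
    n = len(scores)
    means = []
    for _ in range(n_boot):
        resample = [scores[random.randrange(n)] for _ in range(n)]
        means.append(statistics.fmean(resample))
    return means

boot = bootstrap_aggregate(scores)
# Per the paper's asymptotic result, this bootstrap distribution should be
# approximately Gaussian, centred near the point estimate.
center, spread = statistics.fmean(boot), statistics.stdev(boot)
```

The asymptotic Gaussianity result is what lets one replace the full `boot` histogram with just `center` and `spread`, which is where the claimed computational saving comes from.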

Core claim

The central discovery is that by modeling the full distribution of SHAP attribution scores rather than relying on point estimates, one can derive a robust ranking criterion called RoSHAP that accounts for feature activity, strength, and stability, leading to more consistent and reliable feature selections in machine learning models.

What carries the argument

The RoSHAP metric, which integrates the distributional properties of SHAP values estimated via bootstrap and kernel density estimation into a criterion that rewards active, strong, and stable features.

If this is right

  • RoSHAP-selected features enable models with substantially fewer predictors while maintaining comparable predictive performance.
  • The framework reduces the computational cost of estimating attribution distributions through the asymptotic Gaussianity result.
  • Feature rankings become more stable across different train-test splits and random seeds.
  • Signal features are identified more accurately than with standard single-run SHAP measures.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • Applying similar distributional approaches to other attribution techniques could improve their reliability in practice.
  • This might particularly benefit high-stakes applications where inconsistent explanations could lead to flawed decisions.
  • Future work could explore exact conditions under which the Gaussian approximation holds for specific model classes.

Load-bearing premise

The aggregated feature attribution scores are asymptotically Gaussian under mild regularity conditions on the data and model.

What would settle it

Observing bootstrap samples of aggregated SHAP scores that exhibit clear non-Gaussian behavior, such as skewness or multimodality, for standard datasets and models would challenge the asymptotic result.
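
As a concrete falsification probe, one could compute the sample skewness of the bootstrap scores; values far from zero would contradict the Gaussian approximation. A minimal sketch on synthetic data (the exponential sample is a hypothetical stand-in for a non-Gaussian outcome, not anything reported in the paper):

```python
import random
import statistics

def skewness(xs):
    """Sample skewness; approximately 0 for Gaussian data."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return statistics.fmean(((x - m) / s) ** 3 for x in xs)

random.seed(1)
gaussian_like = [random.gauss(0.0, 1.0) for _ in range(2000)]  # consistent with the result
non_gaussian = [random.expovariate(1.0) for _ in range(2000)]  # would challenge it

gauss_skew = abs(skewness(gaussian_like))  # near zero
expo_skew = abs(skewness(non_gaussian))    # large (theoretical value 2 for an exponential)
```

The same diagnostic, applied to real bootstrap samples of aggregated SHAP scores, would directly test the load-bearing premise.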

Figures

Figures reproduced from arXiv: 2605.15154 by Boyu Jiang, Dawei Zhou, Feng Guo, Lanxin Xiang, Liang Shi, Youhui Ye.

Figure 1: Top SHAP-ranked features in the Golub data across different train–test sets. Although all [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗
Figure 2: Overall framework for distributional feature attribution estimation. [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗
Figure 3: Empirical distributions of robust importance estimates over 1000 bootstrap runs with fitted [PITH_FULL_IMAGE:figures/full_fig_p006_3.png] view at source ↗
Figure 4: Distribution of top 10 XGBoost feature contribution estimates for the Golub dataset [PITH_FULL_IMAGE:figures/full_fig_p007_4.png] view at source ↗
Figure 5: Feature-selection performance on the Golub data using XGBoost. Methods are SHAP [PITH_FULL_IMAGE:figures/full_fig_p007_5.png] view at source ↗
Figure 6: Distribution of top 15 CatBoost feature attribution estimates for the Musk dataset. The Musk experiment studies feature-level attribution using a CatBoost classifier [PITH_FULL_IMAGE:figures/full_fig_p008_6.png] view at source ↗
Figure 7: Feature-selection performance on the Musk (Version 2) data using CatBoost. Methods [PITH_FULL_IMAGE:figures/full_fig_p008_7.png] view at source ↗
Figure 8: Distribution of top 15 LightGBM feature attribution estimates for the UJIIndoorLoc dataset. This experiment studies feature-level attribution using a LightGBM regression model [PITH_FULL_IMAGE:figures/full_fig_p008_8.png] view at source ↗
Figure 9: Feature-selection performance on the UJIIndoorLoc data using LightGBM. Methods are [PITH_FULL_IMAGE:figures/full_fig_p009_9.png] view at source ↗
Figure 10: CIFAR-10 ship example using 3 bootstrap rounds. The proposed framework provides additional insight into attribution patterns behind image predictions [PITH_FULL_IMAGE:figures/full_fig_p009_10.png] view at source ↗
Figure 11: CIFAR-10 explanations after 50 ViT-base bootstrap runs. Columns show the original [PITH_FULL_IMAGE:figures/full_fig_p010_11.png] view at source ↗
Figure 12: Golub gene selection performance comparison using CatBoost. Feature selection methods [PITH_FULL_IMAGE:figures/full_fig_p013_12.png] view at source ↗
Figure 13: Golub gene selection performance comparison using Gradient Boosting. Feature selection [PITH_FULL_IMAGE:figures/full_fig_p014_13.png] view at source ↗
Figure 14: Golub gene selection performance comparison using LightGBM. Feature selection [PITH_FULL_IMAGE:figures/full_fig_p014_14.png] view at source ↗
Figure 15: Golub gene selection performance comparison using Logistic Regression. Feature [PITH_FULL_IMAGE:figures/full_fig_p014_15.png] view at source ↗
Figure 16: Golub gene selection performance comparison using Random Forest. Feature selection [PITH_FULL_IMAGE:figures/full_fig_p014_16.png] view at source ↗
Figure 17: Musk (Version 2) variable selection performance comparison using XGBoost. Feature [PITH_FULL_IMAGE:figures/full_fig_p015_17.png] view at source ↗
Figure 18: Musk (Version 2) variable selection performance comparison using Gradient Boosting. [PITH_FULL_IMAGE:figures/full_fig_p015_18.png] view at source ↗
Figure 19: Musk (Version 2) variable selection performance comparison using LightGBM. Feature [PITH_FULL_IMAGE:figures/full_fig_p015_19.png] view at source ↗
Figure 20: Musk (Version 2) variable selection performance comparison using Logistic Regression. [PITH_FULL_IMAGE:figures/full_fig_p015_20.png] view at source ↗
Figure 21: Musk (Version 2) variable selection performance comparison using Random Forest. [PITH_FULL_IMAGE:figures/full_fig_p015_21.png] view at source ↗
Figure 22: UJIIndoorLoc variable selection performance comparison using CatBoost. Feature [PITH_FULL_IMAGE:figures/full_fig_p016_22.png] view at source ↗
Figure 23: UJIIndoorLoc variable selection performance comparison using Gradient Boosting. [PITH_FULL_IMAGE:figures/full_fig_p016_23.png] view at source ↗
Figure 24: UJIIndoorLoc variable selection performance comparison using XGBoost. Feature [PITH_FULL_IMAGE:figures/full_fig_p016_24.png] view at source ↗
Figure 25: UJIIndoorLoc variable selection performance comparison using Linear Regression (Ridge). [PITH_FULL_IMAGE:figures/full_fig_p017_25.png] view at source ↗
Figure 26: Prediction and bootstrap results for image 8. [PITH_FULL_IMAGE:figures/full_fig_p017_26.png] view at source ↗
Figure 27: Prediction and bootstrap results for image 11. [PITH_FULL_IMAGE:figures/full_fig_p017_27.png] view at source ↗
Figure 28: Prediction and bootstrap results for image 20. [PITH_FULL_IMAGE:figures/full_fig_p017_28.png] view at source ↗
Figure 29: Prediction and bootstrap results for image 36. [PITH_FULL_IMAGE:figures/full_fig_p018_29.png] view at source ↗
Figure 30: Prediction and bootstrap results for image 41. [PITH_FULL_IMAGE:figures/full_fig_p018_30.png] view at source ↗
Figure 31: Prediction and bootstrap results for image 66. [PITH_FULL_IMAGE:figures/full_fig_p018_31.png] view at source ↗
Figure 32: Prediction and bootstrap results for image 85. [PITH_FULL_IMAGE:figures/full_fig_p018_32.png] view at source ↗
read the original abstract

Feature attribution analysis is critical for interpreting machine learning models and supporting reliable data-driven decisions. However, feature attribution measures often exhibit stochastic variation: different train--test splits, random seeds, or model-fitting procedures can produce substantially different attribution values and feature rankings. This paper proposes a framework for incorporating stochastic nature of feature attribution and a robust attribution metric, RoSHAP, for stable feature ranking based on the SHAP metric. The proposed framework models the distribution of feature attribution scores and estimates it through bootstrap resampling and kernel density estimation. We show that, under mild regularity conditions, the aggregated feature attribution score is asymptotically Gaussian, which greatly reduces the computational cost of distribution estimation. The RoSHAP summarizes the distribution of SHAP into a robust feature-ranking criterion that simultaneously rewards features that are active, strong, and stable. Through simulations and real-data experiments, the proposed framework and RoSHAP outperform standard single-run attribution measures in identifying signal features. In addition, models built using RoSHAP-selected features achieve predictive performance comparable to full-feature models while using substantially fewer predictors. The proposed RoSHAP approach improves the stability and interpretability of machine learning models, enabling reliable and consistent insights for analysis.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance; this is the friction.

Referee Report

3 major / 3 minor

Summary. The paper proposes a distributional framework for SHAP-based feature attribution that models stochastic variation via bootstrap resampling and kernel density estimation. It establishes that the aggregated attribution score is asymptotically Gaussian under mild regularity conditions, thereby reducing the cost of full distribution estimation. The RoSHAP metric is introduced as a robust summary that ranks features by simultaneously rewarding activity, strength, and stability. Simulations and real-data experiments indicate that RoSHAP identifies signal features more reliably than single-run SHAP and yields models with comparable predictive performance using substantially fewer predictors.

Significance. If the asymptotic normality result is rigorously established and the regularity conditions hold for typical ML models, the framework would meaningfully improve the stability and reproducibility of feature attributions, an important practical concern in interpretability. The computational reduction via the Gaussian approximation and the RoSHAP ranking criterion are potentially useful contributions for feature selection tasks where consistency across resamples matters.

major comments (3)
  1. [theoretical section on asymptotic normality] The central asymptotic Gaussianity claim (abstract and theoretical development) rests on unspecified 'mild regularity conditions.' Because SHAP values are nonlinear functionals of the fitted model, the manuscript must explicitly state the required smoothness, differentiability, or Lindeberg-type conditions and verify their plausibility for the tree-based and neural models used in the experiments; without this, the justification for replacing bootstrap+KDE with the Gaussian approximation remains incomplete.
  2. [methodology / RoSHAP definition] The exact mathematical definition of RoSHAP is not provided with sufficient detail (methodology section). It is unclear how the distributional summary (e.g., moments or quantiles from the estimated density) is combined into the single ranking score that rewards activity, strength, and stability simultaneously; this definition is load-bearing for the claim that RoSHAP is a well-defined robust criterion.
  3. [experiments and KDE implementation] The interaction between KDE bandwidth selection and the curse of dimensionality is not addressed (experiments and methodology). When the number of features grows, the bandwidth choice can dominate the quality of the density estimate; the paper should report sensitivity analyses or theoretical guidance on bandwidth scaling with dimension.
minor comments (3)
  1. [abstract] The abstract and introduction contain minor grammatical inconsistencies (e.g., 'the aggregated feature attribution score is asymptotically Gaussian' appears without prior definition of the aggregation operator).
  2. [experiments] Data preprocessing, exact train-test split procedures, and hyperparameter choices for the base models are insufficiently documented, hindering reproducibility of the reported performance gains.
  3. [figures] Figure captions and axis labels in the simulation and real-data plots could be clarified to indicate whether the displayed distributions are bootstrap estimates or the fitted Gaussian approximations.
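
Major comment 3's bandwidth concern has a standard reference point: Silverman's rule of thumb for a one-dimensional Gaussian-kernel KDE. The sketch below is purely illustrative (the paper does not state which selector it uses); `silverman_bandwidth`, the 1-D kernel, and the halve/double sensitivity probe are assumptions for exposition.

```python
import math
import random
import statistics

def silverman_bandwidth(xs):
    """Silverman's rule of thumb: h = 0.9 * min(sd, IQR/1.34) * n^(-1/5)."""
    sd = statistics.stdev(xs)
    q1, _, q3 = statistics.quantiles(xs, n=4)
    return 0.9 * min(sd, (q3 - q1) / 1.34) * len(xs) ** (-0.2)

def kde_at(x, xs, h):
    """Gaussian-kernel density estimate evaluated at a single point x."""
    z = sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in xs)
    return z / (len(xs) * h * math.sqrt(2 * math.pi))

random.seed(2)
xs = [random.gauss(0.0, 1.0) for _ in range(400)]
h = silverman_bandwidth(xs)
# Sensitivity check in the spirit of the referee's request: halve/double h
# and see how much the density estimate at the mode moves.
densities = {scale: kde_at(0.0, xs, h * scale) for scale in (0.5, 1.0, 2.0)}
```

For a standard-normal sample the estimate at the mode should sit near the true density 0.399; how far the halved and doubled bandwidths drift from it is exactly the kind of sensitivity the referee asks the authors to report.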

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments, which have helped clarify several aspects of the manuscript. We address each major comment point by point below.

read point-by-point responses
  1. Referee: [theoretical section on asymptotic normality] The central asymptotic Gaussianity claim (abstract and theoretical development) rests on unspecified 'mild regularity conditions.' Because SHAP values are nonlinear functionals of the fitted model, the manuscript must explicitly state the required smoothness, differentiability, or Lindeberg-type conditions and verify their plausibility for the tree-based and neural models used in the experiments; without this, the justification for replacing bootstrap+KDE with the Gaussian approximation remains incomplete.

    Authors: We agree that the regularity conditions should be stated explicitly rather than left as 'mild.' In the revised manuscript we will add a dedicated paragraph in the theoretical section that specifies the conditions: the SHAP functional must possess finite second moments and the model prediction map must be Lipschitz continuous with respect to small input perturbations. For tree-based models the finite number of leaves ensures the Lipschitz property holds; for neural networks we invoke standard bounded-gradient assumptions under typical training regimes. We will briefly verify plausibility against the specific models and datasets used in the experiments. revision: yes

  2. Referee: [methodology / RoSHAP definition] The exact mathematical definition of RoSHAP is not provided with sufficient detail (methodology section). It is unclear how the distributional summary (e.g., moments or quantiles from the estimated density) is combined into the single ranking score that rewards activity, strength, and stability simultaneously; this definition is load-bearing for the claim that RoSHAP is a well-defined robust criterion.

    Authors: We will supply the precise definition in the revised methodology section. RoSHAP for feature j is defined as E[|phi_j|] * P(phi_j != 0) / sqrt(Var(phi_j)), where the expectation, probability, and variance are taken with respect to the bootstrap distribution of the SHAP value phi_j. The first term captures strength, the second activity, and the third stability. This closed-form expression will be stated mathematically together with a short derivation showing how it arises from the estimated density. revision: yes

  3. Referee: [experiments and KDE implementation] The interaction between KDE bandwidth selection and the curse of dimensionality is not addressed (experiments and methodology). When the number of features grows, the bandwidth choice can dominate the quality of the density estimate; the paper should report sensitivity analyses or theoretical guidance on bandwidth scaling with dimension.

    Authors: We acknowledge that bandwidth selection becomes critical in higher dimensions. Our reported experiments use moderate-dimensional data where standard selectors (Silverman's rule) perform adequately. In the revision we will add a short sensitivity subsection that varies bandwidth by factors of 0.5 and 2.0 and reports the resulting RoSHAP rankings. We will also note that the asymptotic Gaussian approximation itself bypasses full KDE for the final ranking score, thereby mitigating the dimensionality issue for the primary use case. revision: partial
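
Taken at face value, the closed form quoted in response 2 can be implemented directly from bootstrap samples of phi_j. This is a sketch of that formula as stated in the rebuttal, not the paper's verified definition; the `eps` guard against zero variance is an added assumption.

```python
import statistics

def roshap(phi_samples, eps=1e-12):
    """RoSHAP per the rebuttal's stated closed form:
    E[|phi_j|] * P(phi_j != 0) / sqrt(Var(phi_j)),
    all taken over the bootstrap distribution of phi_j."""
    strength = statistics.fmean(abs(p) for p in phi_samples)      # E[|phi_j|]
    activity = sum(1 for p in phi_samples if p != 0) / len(phi_samples)  # P(phi_j != 0)
    instability = statistics.pvariance(phi_samples) ** 0.5        # sqrt(Var(phi_j))
    return strength * activity / (instability + eps)

# A strong, always-active, stable feature should outrank a weak, erratic one.
stable = [0.9, 1.0, 1.1, 0.95, 1.05]
erratic = [0.0, 2.0, -1.5, 0.0, 1.0]
```

Each of the three ingredients maps onto one of the paper's stated criteria: the numerator rewards strength and activity, while the denominator penalizes instability across resamples.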

Circularity Check

0 steps flagged

No significant circularity: asymptotic Gaussianity derived from standard CLT under independent regularity conditions

full rationale

The paper derives the asymptotic Gaussianity of aggregated feature attribution scores from the central limit theorem applied to bootstrap-resampled SHAP values, under explicitly invoked mild regularity conditions that are not defined in terms of the target result itself. RoSHAP is constructed as a new summary statistic (combining activity, strength, and stability) from the estimated distribution rather than being equivalent to any input parameter or fitted quantity by construction. No self-citation chains, ansatzes smuggled via prior work, or uniqueness theorems from the same authors are load-bearing for the central claims. The estimation procedure (bootstrap + KDE) follows standard nonparametric methods without reducing the prediction to the fitting step.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The central claim rests on standard bootstrap resampling and KDE for distribution estimation, an asymptotic normality result that depends on regularity conditions, and the new definition of RoSHAP as a composite ranking criterion.

axioms (1)
  • domain assumption: mild regularity conditions
    Invoked to establish that the aggregated feature attribution score is asymptotically Gaussian.
invented entities (1)
  • RoSHAP: no independent evidence
    purpose: robust feature-ranking criterion from the distribution of SHAP values
    Newly defined summary that rewards activity, strength, and stability of features.

pith-pipeline@v0.9.0 · 5522 in / 1199 out tokens · 49346 ms · 2026-05-15T02:50:53.812798+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
  • matches: the paper's claim is directly supported by a theorem in the formal canon.
  • supports: the theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: the paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: the paper appears to rely on the theorem as machinery.
  • contradicts: the paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

26 extracted references · 26 canonical work pages

  1. [1] Scott M. Lundberg and Su-In Lee. A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems, 30, 2017.
  2. [2] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. "Why Should I Trust You?": Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1135–1144, 2016.
  3. [3] Avanti Shrikumar, Peyton Greenside, and Anshul Kundaje. Learning important features through propagating activation differences. In International Conference on Machine Learning, pages 3145–3153. PMLR, 2017.
  4. [4] Puskar Bhattarai, Deepa Singh Thakuri, Yuzheng Nie, and Ganesh B. Chand. Explainable AI-based DeepSHAP for mapping the multivariate relationships between regional neuroimaging biomarkers and cognition. European Journal of Radiology, 174:111403, 2024.
  5. [5] Ana Victoria Ponce-Bobadilla, Vanessa Schmitt, Corinna S. Maier, Sven Mensing, and Sven Stodtmann. Practical guide to SHAP analysis: Explaining supervised machine learning model predictions in drug development. Clinical and Translational Science, 17(11):e70056, 2024.
  6. [6] Viswan Vimbi, Noushath Shaffi, and Mufti Mahmud. Interpreting artificial intelligence models: a systematic review on the application of LIME and SHAP in Alzheimer's disease detection. Brain Informatics, 11(1):10, 2024.
  7. [7] Yuchun Liu, Zhihui Liu, Xue Luo, and Hongjingtian Zhao. Diagnosis of Parkinson's disease based on SHAP value feature selection. Biocybernetics and Biomedical Engineering, 42(3):856–869, 2022.
  8. [8] Edoardo Mosca, Ferenc Szigeti, Stella Tragianni, Daniel Gallagher, and Georg Groh. SHAP-based explanation methods: a review for NLP interpretability. In Proceedings of the 29th International Conference on Computational Linguistics, pages 4593–4603, 2022.
  9. [9] Ziqi Li. Extracting spatial effects from machine learning model using local interpretation method: An example of SHAP and XGBoost. Computers, Environment and Urban Systems, 96:101845, 2022.
  10. [10] Jeremy Goldwasser and Giles Hooker. Statistical significance of feature importance rankings. arXiv preprint arXiv:2401.15800, 2024.
  11. [11] Scott M. Lundberg, Gabriel Erion, Hugh Chen, Alex DeGrave, Jordan M. Prutkin, Bala Nair, Ronit Katz, Jonathan Himmelfarb, Nisha Bansal, and Su-In Lee. From local explanations to global understanding with explainable AI for trees. Nature Machine Intelligence, 2(1):56–67, 2020.
  12. [12] Tianqi Chen and Carlos Guestrin. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 785–794, 2016.
  13. [13] Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, and Tie-Yan Liu. LightGBM: A highly efficient gradient boosting decision tree. Advances in Neural Information Processing Systems, 30, 2017.
  14. [14] Torgyn Shaikhina, Umang Bhatt, Roxanne Zhang, Konstantinos Georgatzis, Alice Xiang, and Adrian Weller. Effects of uncertainty on the quality of feature importance explanations. In AAAI Workshop on Explainable Agency in Artificial Intelligence. AAAI Press, Washington, DC, USA, 2021.
  15. [15] Riccardo Scheda and Stefano Diciotti. Explanations of machine learning models in repeated nested cross-validation: an application in age prediction using brain complexity features. Applied Sciences, 12(13):6681, 2022.
  16. [16] Ramzi Halabi, Benoit H. Mulsant, Mirkamal Tolend, Daniel M. Blumberger, Alexandra DeShaw, Arend Hintze, Christina Gonzalez-Torres, Muhammad I. Husain, Helena K. Kim, Claire O'Donovan, et al. A systematic exploration of digital biomarkers for the detection of depressive episodes in bipolar disorder. npj Mental Health Research, 5(1):13, 2026.
  17. [17] David Watson, Joshua O'Hara, Niek Tax, Richard Mudd, and Ido Guy. Explaining predictive uncertainty with information theoretic Shapley values. Advances in Neural Information Processing Systems, 36:7330–7350, 2023.
  18. [18] Wilson E. Marcílio and Danilo M. Eler. From explanations to feature selection: assessing SHAP values as feature selection mechanism. In 2020 33rd SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), pages 340–347. IEEE, 2020.
  19. [19] Yibrah Gebreyesus, Damian Dalton, Sebastian Nixon, Davide De Chiara, and Marta Chinnici. Machine learning for data center optimizations: feature selection using Shapley additive explanation (SHAP). Future Internet, 15(3):88, 2023.
  20. [20] Anruo Shen, Jingnan Sun, Xiaogang Chen, and Xiaorong Gao. A data-centric and interpretable EEG framework for depression severity grading using SHAP-based insights. Journal of NeuroEngineering and Rehabilitation, 22(1):116, 2025.
  21. [21] Huanjing Wang, Qianxin Liang, John T. Hancock, and Taghi M. Khoshgoftaar. Feature selection strategies: a comparative analysis of SHAP-value and importance-based methods. Journal of Big Data, 11(1):44, 2024.
  22. [22] Todd R. Golub, Donna K. Slonim, Pablo Tamayo, Christine Huard, Michelle Gaasenbeek, Jill P. Mesirov, Hilary Coller, Mignon L. Loh, James R. Downing, Mark A. Caligiuri, et al. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science, 286(5439):531–537, 1999.
  23. [23] David Chapman and Ajay Jain. Musk (Version 2). UCI Machine Learning Repository, 1994.
  24. [24] Joaquín Torres-Sospedra, Raúl Montoliu, Adolfo Martínez-Usó, Tomàs Arnau, and Joan P. Avariento. UJIIndoorLoc. UCI Machine Learning Repository, 2014.
  25. [25] Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. 2009.
  26. [26] Patrick Billingsley. Probability and Measure. Wiley, New York, 3rd edition, 1995.