Recognition: no theorem link
XAI and Statistical Analysis for Reliable Intrusion Detection in the UAVIDS-2025 Dataset: From Tree to Hybrid and Tabular DNN Ensembles
Pith reviewed 2026-05-15 05:45 UTC · model grok-4.3
The pith
XGBoost with SHAP and statistical tests shows that density support intersections cause misclassifications of Wormhole and Blackhole attacks in UAVIDS-2025
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
XGBoost achieves high detection accuracy on UAVIDS-2025, and SHAP values combined with Westfall-Young tests and Jensen-Shannon distances on kernel density estimates demonstrate that false predictions for Wormhole and Blackhole attacks arise from density support intersections that allow these attacks to mimic normal traffic distributions.
What carries the argument
SHapley Additive exPlanations (SHAP) for feature attribution, together with Westfall-Young permutation tests and Jensen-Shannon distances applied to bandwidth-optimized kernel density estimates, for quantifying distribution overlap.
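The overlap measurement can be sketched as follows. This is an illustrative reconstruction, not the paper's exact pipeline: the feature values are synthetic one-dimensional stand-ins, and Silverman's rule substitutes for the paper's bandwidth optimization.

```python
import numpy as np
from scipy.stats import gaussian_kde
from scipy.spatial.distance import jensenshannon

rng = np.random.default_rng(0)
# Synthetic stand-ins for one feature's values under normal and attack traffic;
# the heavy overlap mimics the density support intersection discussed above.
normal = rng.normal(0.0, 1.0, 500)
attack = rng.normal(0.5, 1.0, 500)

# KDEs with Silverman's rule as a placeholder for bandwidth optimization
kde_normal = gaussian_kde(normal, bw_method="silverman")
kde_attack = gaussian_kde(attack, bw_method="silverman")

# Evaluate both densities on a shared grid and normalize to discrete distributions
grid = np.linspace(-5.0, 6.0, 400)
p = kde_normal(grid); p /= p.sum()
q = kde_attack(grid); q /= q.sum()

# Jensen-Shannon distance in [0, 1] (base 2): 0 = identical densities,
# small values = strong overlap, i.e. the attack mimics normal traffic
jsd = jensenshannon(p, q, base=2)
print(f"JS distance: {float(jsd):.3f}")
```

A near-zero distance on a SHAP-highlighted feature is the kind of evidence the review cites for overlap-driven misclassification.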
If this is right
- Tree-ensemble models like XGBoost outperform hybrid and DNN approaches for this UAV intrusion detection task.
- SHAP reveals attack-specific feature manipulations that enable traffic mimicry by Wormhole and Blackhole attacks.
- Density support intersection is identified as the mechanism behind persistent misclassifications in two attack types.
- The statistical pipeline using permutation tests and Jensen-Shannon distances offers a repeatable method to diagnose overlap-driven errors in other intrusion datasets.
Where Pith is reading between the lines
- The same density-overlap diagnosis could be applied to other network security datasets where attacks are crafted to blend with normal traffic.
- Real-time UAV monitoring systems might track the SHAP-highlighted features to flag potential mimicry before full classification.
- Future UAVIDS collections could reduce misclassifications by generating attack samples with deliberately lower density support overlap.
Load-bearing premise
The UAVIDS-2025 dataset mirrors real-world UAV network traffic distributions and the statistical tests correctly attribute misclassifications to density overlaps instead of artifacts from modeling or preprocessing.
What would settle it
Repeating the kernel density estimation, violin plot comparison, and Westfall-Young permutation analysis on an independent UAV traffic trace that includes confirmed Wormhole and Blackhole attacks to check whether the same feature overlaps and error patterns reappear.
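A minimal Westfall-Young maxT permutation sketch on synthetic two-group data; the per-feature mean-difference statistic and the binary group structure are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def westfall_young_maxt(X, y, n_perm=1000, seed=0):
    """Westfall-Young maxT adjusted p-values for per-feature mean
    differences between two groups (labels y in {0, 1})."""
    rng = np.random.default_rng(seed)
    def stat(labels):
        return np.abs(X[labels == 1].mean(axis=0) - X[labels == 0].mean(axis=0))
    observed = stat(y)
    exceed = np.zeros_like(observed)
    for _ in range(n_perm):
        # Permuting labels preserves the dependence between features,
        # which is what lets the max statistic control family-wise error.
        exceed += stat(rng.permutation(y)).max() >= observed
    return (exceed + 1) / (n_perm + 1)

# Synthetic data: only feature 0 truly separates the groups
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 100)
X = rng.normal(size=(200, 5))
X[y == 1, 0] += 2.0
p_adj = westfall_young_maxt(X, y)
print(p_adj.round(3))
```

On an independent UAV trace, the same machinery would be run per feature between normal traffic and each confirmed attack class.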
read the original abstract
During the last few years, the term Mechanistic Interpretability, a specific area under the umbrella of explainable artificial intelligence (XAI), has been introduced to explain the decisions made by complex machine learning (ML) models in critical systems like UAV intrusion detection systems (UAVIDS). In this paper, we apply best practices for data pre-processing and examine a wide range of tree ensembles, deep neural networks, hybrid stacking models and the latest ensemble neural networks to detect intrusions in UAVs, with stratified 10-fold cross-validation. With our top-performing model, XGBoost, we proceed to SHapley Additive exPlanations (SHAP) to analyze the global and local feature importances and understand which features each attack targets to mimic normal traffic, and where the misclassifications occur. Furthermore, a distribution analysis follows, visually comparing violin plots and the curves of kernel density estimations. With the Westfall-Young permutation test for multiple comparisons, the bandwidth optimization of the KDEs and the selection of the Jensen-Shannon distance for the test, we discover the true causes of the false predictions observed in Wormhole and Blackhole attacks in UAVIDS-2025. The findings provide robust, reliable and explainable models for UAV intrusion detection, along with statistical insights which capture and clarify the masked nature of the attacks, regarding the challenge of density support intersection between these attacks in this dataset.
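The model-selection step the abstract describes, stratified 10-fold cross-validation over candidate classifiers, can be sketched as below. The data is a synthetic stand-in for UAVIDS-2025, and scikit-learn's GradientBoostingClassifier stands in for XGBoost to keep the sketch dependency-light; neither matches the paper's actual features or hyperparameters.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic, class-imbalanced stand-in for UAVIDS-2025 traffic records
X, y = make_classification(n_samples=600, n_features=10, n_informative=5,
                           weights=[0.8, 0.2], random_state=0)

# Stratified folds keep the attack/normal ratio constant in every fold,
# which matters for rare attack classes
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
model = GradientBoostingClassifier(random_state=0)  # stand-in for XGBoost
scores = cross_val_score(model, X, y, cv=cv, scoring="f1")
print(f"mean F1: {scores.mean():.3f} +/- {scores.std():.3f}")
```

The paper repeats this across tree-ensemble, DNN, hybrid stacking, and ensemble-NN families before picking XGBoost.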
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper evaluates a range of tree-ensemble, DNN, hybrid stacking, and ensemble neural network models for intrusion detection on the UAVIDS-2025 dataset using stratified 10-fold cross-validation. XGBoost is identified as the top performer; SHAP is then applied to analyze global and local feature importances and locate misclassifications. A subsequent distribution analysis employs violin plots, optimized KDEs, Westfall-Young permutation testing, and Jensen-Shannon distance to attribute false predictions on Wormhole and Blackhole attacks to density support intersections in the feature space.
Significance. If the statistical attribution is shown to be robust to modeling and preprocessing choices, the work supplies both competitive detection models and concrete, falsifiable insights into why certain UAV attacks evade detection, which could guide feature engineering and model design in safety-critical IDS settings.
major comments (1)
- [Statistical analysis] Statistical analysis section (post-SHAP paragraph): the claim that Westfall-Young permutation testing plus Jensen-Shannon distance on bandwidth-optimized KDEs isolates density support intersection as the root cause of misclassifications for Wormhole and Blackhole attacks is load-bearing, yet the manuscript reports these tests only on the XGBoost pipeline. No ablation across the hybrid DNN or tabular ensembles also evaluated in the paper, nor across alternative normalizations, is provided to demonstrate invariance of the detected intersections and p-values to model inductive bias or preprocessing artifacts.
minor comments (2)
- [Abstract] Abstract: the phrase 'the latest ensemble neural networks' is undefined; the main text should list the exact architectures and hyperparameter ranges used for all families.
- [Methods] The manuscript should report the exact number of features retained after preprocessing and any feature-engineering steps that could affect the KDE support analysis.
Simulated Author's Rebuttal
We thank the referee for their thorough review and for highlighting the importance of verifying the robustness of our statistical attribution. We address the major comment below and will incorporate the requested extensions in the revised manuscript.
read point-by-point responses
Referee: [Statistical analysis] Statistical analysis section (post-SHAP paragraph): the claim that Westfall-Young permutation testing plus Jensen-Shannon distance on bandwidth-optimized KDEs isolates density support intersection as the root cause of misclassifications for Wormhole and Blackhole attacks is load-bearing, yet the manuscript reports these tests only on the XGBoost pipeline. No ablation across the hybrid DNN or tabular ensembles also evaluated in the paper, nor across alternative normalizations, is provided to demonstrate invariance of the detected intersections and p-values to model inductive bias or preprocessing artifacts.
Authors: We agree that the load-bearing nature of the statistical claim warrants explicit checks for invariance. In the revised version we will extend the Westfall-Young permutation testing, bandwidth-optimized KDE estimation, and Jensen-Shannon distance calculations to the hybrid stacking and tabular ensemble models already evaluated in the paper. These additional results will be presented alongside the original XGBoost findings to confirm that the density-support intersections for Wormhole and Blackhole attacks persist across model families. We will also report a brief sensitivity check under an alternative normalization scheme to address potential preprocessing artifacts. This extension directly strengthens the falsifiability of the mechanistic insight without altering the core conclusions.
revision: yes
Circularity Check
No significant circularity in derivation chain
full rationale
The paper trains standard tree ensembles and DNNs on the UAVIDS-2025 dataset using stratified cross-validation, selects XGBoost as top performer, applies SHAP for post-hoc feature attribution on model outputs, and then conducts separate distribution analysis via violin plots, optimized KDEs, Westfall-Young permutation testing, and Jensen-Shannon distance on the resulting feature values. None of these steps define a quantity in terms of itself, rename a fitted parameter as a prediction, or rely on self-citation chains for load-bearing uniqueness claims. The statistical attribution of misclassifications to density support intersection is derived from external, falsifiable methods applied to the data and model outputs rather than reducing to the inputs by construction.
Axiom & Free-Parameter Ledger
free parameters (2)
- XGBoost and DNN hyperparameters
- KDE bandwidth
axioms (2)
- domain assumption UAVIDS-2025 contains representative samples of normal and attack traffic.
- domain assumption Jensen-Shannon distance on KDEs correctly measures overlap causing misclassifications.
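The KDE bandwidth listed as a free parameter above is typically fixed by cross-validated held-out log-likelihood. A minimal sketch, using scikit-learn's KernelDensity and a grid search rather than the paper's (unspecified) optimizer, on an illustrative synthetic feature sample:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 300).reshape(-1, 1)  # illustrative feature sample

# 5-fold CV over a logarithmic bandwidth grid, scored by
# the average log-likelihood of the held-out fold
search = GridSearchCV(KernelDensity(kernel="gaussian"),
                      {"bandwidth": np.logspace(-1, 0.5, 20)}, cv=5)
search.fit(x)
print(f"selected bandwidth: {search.best_params_['bandwidth']:.3f}")
```

Because the Jensen-Shannon distance is computed on the fitted densities, the overlap conclusion should be checked for stability across reasonable bandwidths, which is part of what the referee's ablation request targets.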
Reference graph
Works this paper leans on
[1] A. A. Halimaa, K. Sundarakantham, "Machine Learning Based Intrusion Detection System", IEEE International Conference on Trends in Electronics and Informatics, October 2019, pp. 916-920, doi: 10.1109/ICOEI.2019.8862784
[2] A. Jamalipour, S. Murali, "A Taxonomy of Machine-Learning-Based Intrusion Detection Systems for the Internet of Things: A Survey", IEEE Internet of Things Journal, vol. 9, no. 12, November 2021, pp. 9444-9466, doi: 10.1109/JIOT.2021.3126811
[3] C. B. Şahin, "Securing UAV Swarms with Vision Transformers: A Byzantine-Robust Federated Learning Framework for Cross-Modal Intrusion Detection", Drones, vol. 10, no. 2, February 2026, doi: 10.3390/drones10020125
[4] D. Holzmüller, L. Grinsztajn, I. Steinwart, "Better by default: Strong Pre-Tuned MLPs and Boosted Trees on Tabular Data", NeurIPS, 2024
[5] E. Haque, K. Hasan, I. Ahmed, M. S. Alam, T. Islam, "Towards an Interpretable AI Framework for Advanced Classification of Unmanned Aerial Vehicles (UAVs)", IEEE 21st Consumer Communications and Networking Conference (CCNC), Las Vegas, NV, USA, 2024, pp. 644-645, doi: 10.1109/CCNC51664.2024.10454862
[6] I. Bibers, O. Arreche, W. Alayed, M. Abdallah, "Ensemble-IDS: An Ensemble Learning Framework for Enhancing AI-Based Network Intrusion Detection Tasks", Applied Sciences, vol. 15, no. 19, September 2025, doi: 10.3390/app151910579
[7] J. Romano, M. Wolf, "Exact and Approximate Stepdown Methods for Multiple Hypothesis Testing", SSRN Electronic Journal, February 2005, doi: 10.2139/ssrn.563267
[8] L. K. Hansen, L. Rieger, "Interpretability in Intelligent Systems — A New Concept?", Springer, September 2019, pp. 41-49, doi: 10.1007/978-3-030-28954-6
[9] L. Lin, H. Ge, Y. Zhou, R. Shangguan, "UAV Airborne Network Intrusion Detection Method Based on Improved Stratified Sampling and Ensemble Learning", Drones, vol. 9, no. 9, August 2025, doi: 10.3390/drones9090604
[10] K. P. Murphy, "Probabilistic Machine Learning: Advanced Topics", MIT Press, Cambridge, Massachusetts, USA, 2023, pp. 55-236
[11] M. A. Hossain, W. Ishtiaq, M. S. Islam, "A Comparative Analysis of Ensemble-Based Machine Learning Approaches With Explainable AI for Multi-Class Intrusion Detection in Drone Networks", Security and Privacy, vol. 9, no. 1, December 2025, doi: 10.1002/spy2.70164
[12] M. Islam, F. Ahmed, W. Ishtiaq, M. Hossai, M. Tarek, "Advanced explainable ensemble models for multi-class intrusion detection in heterogeneous drone and industrial networks", Journal of Information Security, Springer, March 2026, doi: 10.1186/s13635-026-00234-w
[13] M. S. Islam, A. S. Mahmoud, T. R. Sheltami, "AI-Enhanced Intrusion Detection for UAV Systems: A Taxonomy and Comparative Review", Drones, vol. 9, no. 10, 2025, doi: 10.3390/drones9100682
[14] M. Sarhan, S. Layeghy, M. Portmann, "An Explainable Machine Learning-Based Network Intrusion Detection System for Enabling Generalisability in Securing IoT Networks", arXiv, April 2021, doi: 10.48550/arXiv.2104.07183
[15] N. Erickson, J. Mueller, A. Shirkov, H. Zhang, P. Larroy, M. Li, A. Smola, "AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data", arXiv, March 2020, doi: 10.48550/arXiv.2003.06505
[16] N. Meinshausen, M. H. Maathuis, P. Bühlmann, "Asymptotic optimality of the Westfall-Young permutation procedure for multiple testing under dependence", The Annals of Statistics, vol. 39, no. 6, December 2011, pp. 3369-3391, doi: 10.1214/11-AOS946
[17] P. H. Westfall, S. S. Young, "Resampling-Based Multiple Testing: Examples and Methods for p-Value Adjustment", The Annals of Statistics, vol. 43, no. 2, June 1994, pp. 347-348, doi: 10.2307/2348369
[18] Q. Zeng, A. Bashir, F. Nait-Abdesselam, "UAVIDS-2025: A Benchmark Dataset for Intrusion Detection in UAV Networks Using Machine Learning Techniques", 2025 IEEE Conference on Communications and Network Security (CNS), Avignon, France, 2025, pp. 1-9, doi: 10.1109/CNS66487.2025.11194990
[19] S. A. H. Mohsan, N. Q. H. Othman, Y. Li, M. H. Alsharif, M. A. Khan, "Unmanned aerial vehicles (UAVs): practical aspects, applications, open challenges, security issues, and future trends", Intelligent Service Robotics, vol. 16, no. 1, 2023, pp. 109-137, doi: 10.1007/s11370-022-00452-4
[20] S. Ergün, "Explaining XgBoost Predictions with SHAP Value: A Comprehensive Guide to Interpreting Decision Tree-Based Models", New Trends in Computer Sciences, vol. 1, no. 1, April 2023, pp. 19-31, doi: 10.3846/ntcs.2023.17901
[21] S. Kim, S. A. Kim, G. Kim, E. Menadjiev, C. Lee, S. Chung, N. Kim, J. Choi, "PnPXAI: A Universal XAI Framework Providing Automatic Explanations Across Diverse Modalities and Models", arXiv, May 2025, doi: 10.48550/arXiv.2505.10515
[22] S. M. Lundberg, S. L. Lee, "A Unified Approach to Interpreting Model Predictions", NeurIPS, December 2017, pp. 4765-4774, doi: 10.48550/arXiv.1705.07874
[23] S. Neupane, J. Ables, W. Anderson, S. Mittal, S. Rahimi, I. Banicescu, M. Seale, "Explainable Intrusion Detection Systems (X-IDS): A Survey of Current Methods, Challenges, and Opportunities", IEEE Access, vol. 10, no. 7, January 2022, doi: 10.48550/arXiv.2207.06236
[24] S. Niyonsaba, K. Konate, M. M. Soidridine, "Deep Learning Based Intrusion Detection for Cybersecurity in Unmanned Aerial Vehicles Network", International Conference on Electrical, Computer and Energy Technologies (ICECET), Sydney, Australia, 2024, pp. 1-6, doi: 10.1109/ICECET61485.2024.10698453
[25] S. Wang, Z. Deng, FL. Chung, W. Hu, "From Gaussian kernel density estimation to kernel methods", International Journal of Machine Learning and Cybernetics, vol. 4, no. 2, April 2013, pp. 119-137, doi: 10.1007/s13042-012-0078-8
[26] T. Yamaguchi, Y. Zhou, M. Ryo, K. Katsura, "Agentic Explainable Artificial Intelligence (Agentic XAI) Approach To Explore Better Explanation", arXiv, December 2025, doi: 10.48550/arXiv.2512.21066
[27] V. Arya, R. K. E. Bellamy, P.-Y. Chen, A. Dhurandhar, M. Hind, S. C. Hoffman, S. Houde, Q. V. Liao, R. Luss, A. Mojsilović, S. Mourad, P. Pedemonte, R. Raghavendra, J. Richards, P. Sattigeri, K. Shanmugam, M. Singh, K. R. Varshney, D. Wei, Y. Zhang, "One Explanation Does Not Fit All: A Toolkit and Taxonomy of AI Explainability Techniques", arXiv, Septe...
[28] V. U. Ihekoronye, S. O. Ajakwe, J. M. Lee, D.-S. Kim, "DroneGuard: An Explainable and Efficient Machine Learning Framework for Intrusion Detection in Drone Networks", IEEE Internet of Things Journal, vol. 12, no. 7, April 2025, pp. 7708-7722, doi: 10.1109/JIOT.2024.3519633
[29] Y.-W. Hong, D.-Y. Yoo, "Multiple Intrusion Detection Using Shapley Additive Explanations and a Heterogeneous Ensemble Model in an Unmanned Aerial Vehicle's Controller Area Network", Applied Sciences, vol. 14, no. 13, June 2024, doi: 10.3390/app14135487