pith. machine review for the scientific record.

arxiv: 2604.26219 · v1 · submitted 2026-04-29 · 💻 cs.CR · cs.LG

Recognition: unknown

eDySec: A Deep Learning-based Explainable Dynamic Analysis Framework for Detecting Malicious Packages in PyPI Ecosystem

Authors on Pith: no claims yet

Pith reviewed 2026-05-07 13:32 UTC · model grok-4.3

classification 💻 cs.CR · cs.LG
keywords malicious package detection · deep learning · dynamic analysis · PyPI ecosystem · explainable AI · software supply chain · behavioral features · false positive reduction

The pith

Deep learning on dynamic package behaviors detects malicious PyPI packages with half the features and 82 percent fewer false positives.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces eDySec, a framework that applies deep learning to dynamic behavioral data from PyPI packages, including system calls, network traffic, directory access, and dependency logs captured at install time and afterward. It argues that this handles the high-dimensional and sparse character of such data more effectively than traditional machine learning, while adding stability checks and explainable AI to make decisions reliable and transparent. Evaluation on the QUT-DV25 dataset shows the framework halves feature dimensionality, cuts false positives by 82 percent, false negatives by 79 percent, raises accuracy by 3 percent, reaches near-perfect stability, and runs at 170 milliseconds per package. A sympathetic reader would care because supply-chain attacks on open-source repositories are rising, and practical improvements in detection could reduce the chance that developers unknowingly incorporate malicious code.

Core claim

eDySec is a deep learning-based explainable dynamic analysis framework that outperforms state-of-the-art methods for detecting malicious PyPI packages. It achieves this by evaluating deep learning models on dynamic behavioral features from the QUT-DV25 dataset, selecting the most discriminative attributes, incorporating model stability analysis, and applying explainable AI techniques. The result is halved feature dimensionality, 82 percent lower false positives, 79 percent lower false negatives, 3 percent higher accuracy, near-perfect stability, and 170 ms inference latency per package, with the authors noting that poor feature or model choices can degrade performance.

What carries the argument

The eDySec pipeline, which combines deep learning models applied to selected dynamic behavioral features (install-time and post-installation) with stability analysis and explainable AI to produce efficient and interpretable detections.
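
The shape of that pipeline can be sketched independently of the paper's specifics: a selector that halves a sparse behavioral feature matrix, feeding a small neural classifier. Everything below is illustrative, assuming synthetic stand-in data, mutual-information selection, and an MLP; the authors' actual features, selector, and architecture are not reproduced here.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Illustrative stand-in for dynamic behavioral features: sparse counts of
# syscalls, network events, and file accesses per package (not real data).
n_packages, n_features = 400, 64
X = rng.poisson(0.3, size=(n_packages, n_features)).astype(float)
y = rng.integers(0, 2, size=n_packages)              # 1 = malicious (synthetic labels)
X[y == 1, :4] += rng.poisson(3.0, size=(y.sum(), 4)) # plant signal in 4 features

# Halve dimensionality (the reduction eDySec reports), then fit a small MLP.
detector = make_pipeline(
    SelectKBest(mutual_info_classif, k=n_features // 2),
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
)
detector.fit(X, y)
print(f"train accuracy: {detector.score(X, y):.2f}")
```

Putting selection and classifier in one pipeline object also matters later: it is what lets cross-validation refit the selector per fold instead of leaking labels.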

If this is right

  • Fewer legitimate packages are incorrectly flagged, reducing unnecessary review burden for developers and repository maintainers.
  • Lower false negatives mean more actual malicious packages are caught before they reach users.
  • Explainable outputs allow security teams to understand and verify the reasons for each detection.
  • Reduced feature count and 170 ms latency make the approach suitable for integration into package installation workflows.
  • The finding that some model-feature combinations degrade performance highlights the need for careful selection in any deployed detector.
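
The 170 ms figure is checkable for any candidate detector with a trivial timing harness. A minimal sketch, assuming a hypothetical linear scoring head over behavioral counts (not the authors' model):

```python
import time
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical trained linear scoring head over ~1000 behavioral features;
# stands in for whatever model a deployed detector would load.
n_features = 1000
weights = rng.normal(size=n_features)
bias = -0.5

def score_package(features: np.ndarray) -> bool:
    """Return True if the package is flagged as malicious."""
    return float(features @ weights + bias) > 0.0

# Time inference over many packages and report per-package latency.
packages = rng.poisson(0.2, size=(500, n_features)).astype(float)
t0 = time.perf_counter()
flags = [score_package(p) for p in packages]
per_pkg_ms = (time.perf_counter() - t0) / len(packages) * 1000
print(f"{per_pkg_ms:.3f} ms per package")
```

A linear head lands far below the 170 ms budget; the point of the harness is that any replacement model can be held to the same number before being wired into an install workflow.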

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the dynamic-behavior approach generalizes, package managers could run similar lightweight scans automatically during installation.
  • The same feature-selection and stability methods might improve detection in other language ecosystems that face analogous supply-chain risks.
  • Combining the dynamic signals emphasized here with static code analysis could address attacks that only appear after installation.
  • The emphasis on model stability suggests the framework could support repeated scans over time without retraining drift.
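
If the first speculation held and package managers ran such scans at install time, the integration point could be a gate between trace collection and installation. The sketch below is hypothetical glue, not part of eDySec: the trace fields, the `allow_install` rule, and the threshold are all invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class InstallTrace:
    """Hypothetical install-time behavior summary for one package."""
    package: str
    syscall_count: int
    outbound_hosts: int
    paths_outside_prefix: int

def allow_install(trace: InstallTrace, flag_score: float, threshold: float = 0.5) -> bool:
    # flag_score would come from a detector like eDySec; here it is an input.
    # A gate could also combine the score with hard rules on raw behavior.
    if trace.paths_outside_prefix > 0 and trace.outbound_hosts > 0:
        return False  # exfiltration-shaped behavior: block regardless of score
    return flag_score < threshold

trace = InstallTrace("example-pkg", syscall_count=120,
                     outbound_hosts=0, paths_outside_prefix=0)
print(allow_install(trace, flag_score=0.1))  # benign-looking trace, low score
```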

Load-bearing premise

The performance gains measured on the QUT-DV25 dataset will generalize to the full range of real-world PyPI packages without the chosen models and features overfitting to the dataset's particular traits.

What would settle it

Testing eDySec on an independent collection of labeled malicious and benign PyPI packages gathered after the QUT-DV25 dataset would show whether the reported drops in false positives and negatives, accuracy gain, and stability persist.
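
That experiment amounts to a temporal holdout: fit on packages collected before a cutoff date, evaluate only on packages published after it. A pure-Python sketch with synthetic records (names, dates, labels, and counts all invented):

```python
from datetime import date

# Synthetic records: (name, collection_date, is_malicious, suspicious_event_count)
records = [
    ("pkg-a", date(2025, 3, 1), False, 0),
    ("pkg-b", date(2025, 4, 2), True, 7),
    ("pkg-c", date(2025, 5, 9), False, 1),
    ("pkg-d", date(2025, 6, 20), True, 9),
    ("pkg-e", date(2026, 1, 5), False, 0),   # post-cutoff: unseen distribution
    ("pkg-f", date(2026, 2, 11), True, 6),   # post-cutoff: attack has drifted
]
cutoff = date(2025, 12, 31)
train = [r for r in records if r[1] <= cutoff]
test = [r for r in records if r[1] > cutoff]

# "Train": pick the count threshold that separates the pre-cutoff labels.
threshold = min(c for _, _, mal, c in train if mal)

# Evaluate only on packages gathered after the cutoff.
correct = sum((c >= threshold) == mal for _, _, mal, c in test)
print(f"temporal-holdout accuracy: {correct}/{len(test)}")
```

With these synthetic records the post-cutoff malicious package drifts just below the learned threshold, so accuracy falls to 1/2. That drop is exactly the failure mode a temporal holdout is designed to expose and a fixed-dataset split can hide.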

Figures

Figures reproduced from arXiv:2604.26219, by Abu Bakar Siddique Mahi, Chadni Islam, Gowri Ramachandran, Raja Jurdak, and Sk Tanzir Mehedi. Captions are as extracted; several are truncated.

Figure 1: Overall system architecture of the proposed eDySec
Figure 2: Proposed eDySec framework for detecting mali…
Figure 3: Overview of the QUT-DV25 dataset: (a) statistics of…
Figure 4: t-SNE visualization of 200 randomly selected sam…
Figure 5: Performance comparison of feature selection meth…
Figure 6: Performance of FLAML-based MLP model on the…
Figure 7: Performance of FLAML-based MLP model on the…
Figure 8: Comparison of (a) FPR and FNR across trace types…
Figure 9: Global SHAP summary of the most influential fea…
Figure 10: SHAP waterfall explanations for representative…
Figure 11: Local explanations for representative samples. (a)…

A second extracted set restarts the numbering with full captions:

Figure 1: Performance comparison of different feature selection methods on the QUT-DV25 dataset using MLP model.
Figure 2: Performance of the FLAML-DL models on the QUT-DV25 Combined dataset: (a) accuracy and (b) loss.
Figure 3: Performance of the FLAML-DL models on the QUT-DV25 Combined dataset: (a) confusion matrix and (b) ROC curve.
Figure 4: Performance of the FLAML-DL models on the Pattern trace dataset: (a) accuracy and (b) loss.
Figure 5: Performance of the FLAML-DL models on the Pattern trace dataset: (a) confusion matrix and (b) ROC curve.
read the original abstract

The security of open-source software repositories is increasingly threatened by next-gen software supply chain attacks. These attacks include multiphase malware execution, remote access activation, and dynamic payload generation. Traditional Machine Learning (ML) detectors struggle to detect these attacks due to the high-dimensional and sparse nature of dynamic behavioral data, including system calls, network traffic, directory access patterns, and dependency logs. As a result, these data characteristics degrade the performance, stability, and explainability of ML models. These challenges have made Deep Learning (DL) a promising alternative, given its success across various domains and its potential for modeling complex patterns. This paper presents eDySec, a DL-based efficient, stable, and explainable framework for dynamic behavioral analysis to detect malicious packages. Using the QUT-DV25 dataset, which captures both install-time and post-installation behaviors of packages, we evaluate DL models and investigate feature sets to identify the most discriminative attributes for enabling efficient malicious package detection. Additionally, model stability analysis and explainable AI techniques are incorporated into the detection pipeline to enable stable, and transparent interpretations of model decisions. Experimental results demonstrate that eDySec significantly outperforms the state-of-the-art frameworks. Specifically, it halves feature dimensionality while lowering false positives by 82% and false negatives by 79%. It also improves accuracy by 3%, achieves near-perfect stability, and maintains an inference latency of 170ms per package. Further analysis reveals that feature and model selection play a critical role, as certain combinations degrade performance. Ultimately, this study advances the understanding of the strengths and limitations of dynamic analysis against next-gen attacks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper presents eDySec, a deep learning-based explainable dynamic analysis framework for detecting malicious packages in the PyPI ecosystem. It evaluates DL models and feature sets on the QUT-DV25 dataset (capturing install-time and post-install behaviors), incorporates stability analysis and XAI techniques, and claims to outperform SOTA by halving feature dimensionality, reducing false positives by 82%, false negatives by 79%, improving accuracy by 3%, achieving near-perfect stability, and maintaining 170ms inference latency per package.

Significance. If the empirical results prove robust, this would represent a meaningful advance in software supply-chain security by tackling high-dimensional sparse dynamic data (system calls, network traffic, etc.) with efficient, stable, and interpretable DL models. The emphasis on next-generation attacks and the combination of performance, stability, and explainability could inform practical detectors for open-source repositories.

major comments (3)
  1. [Experimental results] The experimental evaluation provides no details on QUT-DV25 dataset size, collection window, labeling source or process, or train/test split strategy. These omissions are load-bearing because the headline claims (82% FP reduction, 79% FN reduction, halving of dimensionality) cannot be assessed for generalizability or absence of collection artifacts without this information.
  2. [Feature and model selection] It is not stated whether feature selection (the process that halves dimensionality) was performed inside or outside the cross-validation loop. If performed on the full dataset, the reported performance deltas and stability results are at risk of optimistic bias and may not reflect true out-of-sample behavior on the sparsity profile of QUT-DV25.
  3. [Results and discussion] The state-of-the-art baselines used for comparison are not described in sufficient detail (implementation, hyper-parameters, or exact experimental conditions), preventing verification of the claimed 3% accuracy improvement and the 82%/79% FP/FN reductions.
minor comments (1)
  1. [Abstract] The abstract contains a minor grammatical issue ('stable, and transparent' should read 'stable and transparent').

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and have revised the manuscript to enhance reproducibility and clarity.

read point-by-point responses
  1. Referee: The experimental evaluation provides no details on QUT-DV25 dataset size, collection window, labeling source or process, or train/test split strategy. These omissions are load-bearing because the headline claims (82% FP reduction, 79% FN reduction, halving of dimensionality) cannot be assessed for generalizability or absence of collection artifacts without this information.

    Authors: We agree that these details are necessary to evaluate generalizability and rule out artifacts. The revised manuscript adds a dedicated subsection in the Experimental Setup that reports the QUT-DV25 dataset size, collection window and methodology, labeling source and process, and the train/test split strategy (including stratification and ratio). revision: yes

  2. Referee: It is not stated whether feature selection (the process that halves dimensionality) was performed inside or outside the cross-validation loop. If performed on the full dataset, the reported performance deltas and stability results are at risk of optimistic bias and may not reflect true out-of-sample behavior on the sparsity profile of QUT-DV25.

    Authors: We appreciate the concern about potential leakage. Feature selection was performed inside the cross-validation loop on training folds only. The revised manuscript explicitly states this in the Feature Selection subsection, describes the method used, and reports the average dimensionality reduction observed across folds. revision: yes

  3. Referee: The state-of-the-art baselines used for comparison are not described in sufficient detail (implementation, hyper-parameters, or exact experimental conditions), preventing verification of the claimed 3% accuracy improvement and the 82%/79% FP/FN reductions.

    Authors: We agree that additional detail is required for verification. The revised manuscript expands the Baselines subsection to include implementation details (libraries and versions), hyper-parameter settings, and the exact experimental conditions (same splits and preprocessing) under which the baselines were run. revision: yes
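
The inside-versus-outside-the-CV-loop distinction in point 2 is mechanical once the selector lives in a pipeline: each fold then refits selection on its own training data. A generic scikit-learn sketch of the contrast on pure noise (not the authors' code or data):

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 100))    # pure noise features
y = rng.integers(0, 2, size=200)   # labels independent of X

# LEAKY: selecting on the full dataset before CV lets label noise into the folds.
leaky_idx = np.argsort(f_classif(X, y)[0])[-10:]
leaky = cross_val_score(LogisticRegression(max_iter=1000),
                        X[:, leaky_idx], y, cv=5).mean()

# CORRECT: selection happens inside each CV training fold via the pipeline.
pipe = make_pipeline(SelectKBest(f_classif, k=10),
                     LogisticRegression(max_iter=1000))
honest = cross_val_score(pipe, X, y, cv=5).mean()

print(f"leaky CV accuracy:  {leaky:.2f}")   # typically inflated above chance
print(f"honest CV accuracy: {honest:.2f}")  # typically near 0.5 on pure noise
```

On features carrying no signal at all, the leaked variant still scores above chance, which is the optimistic bias the referee is asking the authors to rule out.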

Circularity Check

0 steps flagged

No circularity: empirical performance claims rest on dataset evaluation

full rationale

The paper presents eDySec as a DL framework evaluated on the QUT-DV25 dataset for malicious package detection, reporting empirical metrics such as halved feature dimensionality, 82% lower false positives, 79% lower false negatives, 3% accuracy improvement, near-perfect stability, and 170ms latency. No mathematical derivation chain, equations, or self-referential definitions are present in the provided text. Performance claims are framed as experimental outcomes from model training and testing rather than predictions derived from fitted parameters or self-citations that reduce to the inputs by construction. The central results depend on external dataset evaluation and comparisons to SOTA, which are falsifiable and not tautological.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The work applies established deep learning and XAI methods to a security dataset without introducing new mathematical constructs or entities. The main dependencies are on the dataset and standard ML assumptions.

free parameters (2)
  • Deep learning model hyperparameters
    Tuned to optimize detection performance on the QUT-DV25 dataset as part of the evaluation.
  • Feature selection criteria
    Determines which behavioral attributes are most discriminative, leading to the reported dimensionality reduction.
axioms (1)
  • domain assumption: high-dimensional, sparse dynamic behavioral data from packages can be modeled effectively by deep learning for malicious-package detection.
    This underpins the choice of DL over traditional ML, as stated in the abstract.
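
The first free parameter is the one the paper itself flags: certain model-feature combinations degrade performance. A grid search makes that sensitivity visible as the spread between best and worst configurations. Illustrative scikit-learn sketch on a synthetic task; the search space is hypothetical, not the authors':

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # synthetic separable task

# Hypothetical hyperparameter grid; the degradation claim predicts a spread
# between the best and worst mean cross-validated scores.
grid = GridSearchCV(
    MLPClassifier(max_iter=800, random_state=0),
    param_grid={"hidden_layer_sizes": [(4,), (32,), (64, 32)],
                "alpha": [1e-4, 1e-1]},
    cv=3,
)
grid.fit(X, y)
scores = grid.cv_results_["mean_test_score"]
print(f"best {scores.max():.2f} vs worst {scores.min():.2f} across combinations")
```

Reporting the whole spread, rather than only the winning configuration, is what turns "feature and model selection play a critical role" from a caveat into a measurement.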

pith-pipeline@v0.9.0 · 5620 in / 1424 out tokens · 109171 ms · 2026-05-07T13:32:37.334890+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

70 extracted references · 33 canonical work pages · 1 internal anchor
