pith. machine review for the scientific record.

arxiv: 2605.01739 · v1 · submitted 2026-05-03 · 💻 cs.CR

Recognition: unknown

AgenticVM: Agentic AI for Adaptive Software Vulnerability Management

Asrul Arifin , Hussain Ahmad , Yiyao Zhang , Diksha Goel


Pith reviewed 2026-05-10 15:08 UTC · model grok-4.3

classification 💻 cs.CR
keywords vulnerability management · multi-agent systems · alert reduction · CVSS prediction · LLM agents · security automation · software security

The pith

A multi-agent AI system reduces vulnerability scanner alerts by up to 98 percent while predicting missing severity scores at 89 percent accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces AgenticVM as a framework that deploys several specialized AI agents to automate the full vulnerability management cycle of detection, assessment, prioritization, and reporting. It blends rule-based filtering, a BERT model that fills in missing CVSS details, and language-model agents that consult public databases to turn thousands of raw scanner findings into a short list of high-priority items. This matters because growing software systems produce more alerts than analysts can review manually, creating backlogs and fatigue. The reported outcomes show that the volume of items requiring human attention drops sharply while the essential risk picture stays intact.

Core claim

AgenticVM integrates rule-based processing, a BERT-based CVSS prediction module, and specialised LLM-driven agents, leveraging data from sources such as the National Vulnerability Database and the European Union Vulnerability Database. Across multiple evaluation scenarios, AgenticVM reduces raw scanner outputs into compact, actionable queues, achieving up to 98% alert reduction (e.g., from 3,983 findings to 82 high-priority items), while predicting missing CVSS attributes with 89.3% accuracy. These results demonstrate improved prioritisation efficiency and reduced analyst workload without compromising risk visibility.
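
The headline reduction figure can be checked directly from the two counts quoted above. The snippet below is a minimal sketch of that arithmetic; the function name `alert_reduction` is an illustrative choice, not something taken from the paper.

```python
def alert_reduction(raw_findings: int, retained: int) -> float:
    """Return the fraction of raw findings removed from the analyst queue."""
    if raw_findings <= 0 or not 0 <= retained <= raw_findings:
        raise ValueError("expected 0 <= retained <= raw_findings and raw_findings > 0")
    return 1 - retained / raw_findings


# Counts quoted in the core claim: 3,983 raw findings, 82 retained.
reduction = alert_reduction(3983, 82)
discarded = 3983 - 82
print(f"discarded={discarded}, reduction={reduction:.1%}")
```

The result is just under 98 percent, consistent with the "up to 98%" wording.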

What carries the argument

The multi-agent framework that assigns distinct tasks to LLM agents, combines their outputs with rule-based and BERT processing, and pulls in external vulnerability database records.
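
To make the staging concrete, the following is a toy sketch of how rule-based filtering, database enrichment, and score prediction might be chained. It is not the paper's implementation. Every name here (`Finding`, `KNOWN_SCORES`, `rule_filter`, `enrich`, `predict_missing`, `prioritise`) is hypothetical, the lookup table stands in for NVD/EUVD queries, and a keyword heuristic stands in for the BERT predictor.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Finding:
    cve_id: str
    description: str
    cvss: Optional[float] = None  # None models a missing CVSS score


# Hypothetical stand-in for an NVD/EUVD lookup (placeholder identifier).
KNOWN_SCORES = {"CVE-0000-0001": 9.8}


def rule_filter(findings: List[Finding]) -> List[Finding]:
    """Rule-based stage: drop repeated identifiers (an illustrative rule)."""
    seen, kept = set(), []
    for f in findings:
        if f.cve_id not in seen:
            seen.add(f.cve_id)
            kept.append(f)
    return kept


def enrich(finding: Finding) -> Finding:
    """Database stage: fill a missing score from the lookup table if present."""
    if finding.cvss is None and finding.cve_id in KNOWN_SCORES:
        return replace(finding, cvss=KNOWN_SCORES[finding.cve_id])
    return finding


def predict_missing(finding: Finding) -> Finding:
    """Prediction stage: keyword heuristic, NOT the paper's BERT model."""
    if finding.cvss is not None:
        return finding
    risky = "remote code execution" in finding.description.lower()
    return replace(finding, cvss=9.0 if risky else 4.0)


def prioritise(findings: List[Finding], threshold: float = 7.0) -> List[Finding]:
    """Chain the stages and keep findings at or above the threshold."""
    staged = [predict_missing(enrich(f)) for f in rule_filter(findings)]
    kept = [f for f in staged if f.cvss >= threshold]
    return sorted(kept, key=lambda f: f.cvss, reverse=True)


findings = [
    Finding("CVE-0000-0001", "buffer overflow"),
    Finding("CVE-0000-0001", "buffer overflow"),
    Finding("CVE-0000-0002", "Remote code execution in parser"),
    Finding("CVE-0000-0003", "verbose error message", 3.1),
]
queue = prioritise(findings)
```

Each stage is a pure function over a finding, so an error introduced by one stage (for example, a wrong predicted score) can be traced and tested in isolation. That is the kind of check the load-bearing premise calls for.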

If this is right

  • Analyst review time shrinks because only a small fraction of findings reach the final queue.
  • Missing CVSS fields no longer block consistent risk scoring across tools.
  • The same agent decomposition pattern can be reused for other security workflows that involve high data volume.
  • Human oversight is retained as a final governance step rather than eliminated.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same agent-plus-tool pattern could be applied to related high-volume tasks such as compliance scanning or configuration drift detection.
  • If accuracy remains stable across new vulnerability classes, the approach might reduce the need for constant manual database updates.
  • Production deployment would need monitoring for cases where an agent chain misroutes a novel zero-day finding.

Load-bearing premise

The agents can merge rule-based results, machine-learning predictions, and database lookups without creating new errors or dropping important risk signals.

What would settle it

A test on an independent scanner dataset in which more than 20 percent of the originally critical vulnerabilities are incorrectly ranked low and so fail to reach the reduced queue.
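
One reading of this test is that it measures how many ground-truth critical findings never reach the reduced queue. The sketch below computes that miss rate under the assumption that findings carry stable identifiers and that an independent ground-truth set of critical identifiers exists. The function name and the sample identifiers are hypothetical.

```python
from typing import Iterable


def critical_miss_rate(critical_ids: Iterable[str], queue_ids: Iterable[str]) -> float:
    """Return the share of ground-truth critical findings absent from the queue."""
    critical = set(critical_ids)
    if not critical:
        raise ValueError("no ground-truth critical findings supplied")
    missed = critical - set(queue_ids)
    return len(missed) / len(critical)


# Toy data: five critical findings, three of which reach the queue.
rate = critical_miss_rate({"A", "B", "C", "D", "E"}, {"A", "B", "C", "X"})
refuted = rate > 0.20  # threshold taken from the settling test above
```

With these toy inputs two of five critical findings are missed, so the 20 percent threshold is exceeded.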

Figures

Figures reproduced from arXiv: 2605.01739 by Asrul Arifin, Diksha Goel, Hussain Ahmad, Yiyao Zhang.

Figure 1: AgenticVM architecture illustrating the orchestration layer and the flow of vulnerability data across specialised agents.
Figure 2: BERT-based CVSS prediction model architecture.
Original abstract

As software systems grow in scale and complexity, vulnerability management is increasingly strained by high alert volumes, fragmented toolchains, and manual triage processes. We introduce AgenticVM, a multi-agent framework that integrates large language models with security tools to automate vulnerability detection, assessment, prioritization, and reporting. AgenticVM combines rule-based processing, a BERT-based CVSS prediction module, and specialised LLM-driven agents, leveraging data from sources such as the National Vulnerability Database and the European Union Vulnerability Database. Across multiple evaluation scenarios, AgenticVM reduces raw scanner outputs into compact, actionable queues, achieving up to 98% alert reduction (e.g., from 3,983 findings to 82 high-priority items), while predicting missing CVSS attributes with 89.3% accuracy. These results demonstrate improved prioritisation efficiency and reduced analyst workload without compromising risk visibility. Beyond performance, the framework provides practical design insights into agent decomposition, tool-LLM integration, and human-in-the-loop governance for real-world deployment.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces AgenticVM, a multi-agent framework that integrates large language models with security tools to automate vulnerability detection, assessment, prioritization, and reporting. It combines rule-based processing, a BERT-based CVSS prediction module, and specialized LLM-driven agents, drawing on data from the National Vulnerability Database and the European Union Vulnerability Database. The central results are an up to 98% reduction in raw scanner alerts (e.g., 3,983 findings to 82 high-priority items) and 89.3% accuracy in predicting missing CVSS attributes, with the assertion that this occurs without compromising overall risk visibility.

Significance. If the performance claims hold under rigorous validation, AgenticVM could substantially alleviate alert fatigue and manual triage burdens in large-scale vulnerability management. The framework's emphasis on agent decomposition, tool-LLM integration, and human-in-the-loop governance supplies concrete design insights for deploying agentic systems in security operations.

major comments (2)
  1. [Evaluation] Evaluation section: The headline claim of up to 98% alert reduction without compromising risk visibility lacks any reported false-negative rate, ground-truth labeling of the original 3,983 findings, or expert review of the 3,901 discarded items. The evaluation supplies only the reduction ratio and CVSS accuracy; without these checks the assertion that no critical vulnerabilities were lost remains an untested assumption.
  2. [Abstract and Evaluation sections] Abstract and Evaluation sections: Concrete performance figures (98% reduction, 89.3% accuracy) are reported without identifying the evaluation datasets, the public scanner outputs used, any baseline methods, the number or nature of the 'multiple evaluation scenarios,' or statistical significance tests. This prevents assessment of generalizability and reproducibility.
minor comments (1)
  1. A diagram or table summarizing the responsibilities, data flows, and hand-off points among the rule-based processor, BERT module, and individual LLM agents would improve clarity of the architecture.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments highlight important gaps in the evaluation that we have addressed through revisions to improve clarity, reproducibility, and the strength of our claims. We respond to each major comment below.

  1. Referee: [Evaluation] Evaluation section: The headline claim of up to 98% alert reduction without compromising risk visibility lacks any reported false-negative rate, ground-truth labeling of the original 3,983 findings, or expert review of the 3,901 discarded items. The evaluation supplies only the reduction ratio and CVSS accuracy; without these checks the assertion that no critical vulnerabilities were lost remains an untested assumption.

    Authors: We agree this is a valid concern and that the original evaluation did not provide sufficient direct evidence for the 'without compromising risk visibility' claim. In the revised manuscript, we have added a new subsection in Evaluation that details the prioritization rules (retaining all critical/high CVSS items and using CVSS prediction to flag gaps), reports the severity distribution of retained items, and includes results from a post-hoc manual review of a 200-item random sample of discarded findings (confirming no critical vulnerabilities were lost in the sample). We have also added an explicit limitations paragraph noting the absence of exhaustive ground-truth labeling across all 3,983 items due to scale and have revised the abstract and evaluation text to state 'while preserving high-priority risks' rather than the stronger original phrasing. Statistical tests on severity retention are now reported. revision: yes

  2. Referee: [Abstract and Evaluation sections] Abstract and Evaluation sections: Concrete performance figures (98% reduction, 89.3% accuracy) are reported without identifying the evaluation datasets, the public scanner outputs used, any baseline methods, the number or nature of the 'multiple evaluation scenarios,' or statistical significance tests. This prevents assessment of generalizability and reproducibility.

    Authors: We accept that these details were insufficiently specified. The revised manuscript now includes: (1) explicit description of the evaluation datasets (outputs from vulnerability scanners such as OpenVAS and Nessus on publicly available vulnerable container images and code repositories), (2) the three evaluation scenarios (small: 150 findings; medium: 1,200 findings; large: 3,983 findings), (3) baseline comparisons (simple rule-based filtering and a single-agent LLM baseline), and (4) statistical significance results (Wilcoxon signed-rank tests, p < 0.01 for alert reduction). The abstract has been updated with a brief reference to these elements. We believe these additions enable better assessment of generalizability. revision: yes
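
The first response cites a 200-item random sample of discarded findings with no critical vulnerabilities found. As a hedged illustration of how much that supports, the sketch below computes the exact one-sided binomial upper bound on the miss rate when zero events are observed. It assumes simple random sampling and treats draws as independent, ignoring the finite-population correction. The function name is hypothetical.

```python
def zero_event_upper_bound(sample_size: int, confidence: float = 0.95) -> float:
    """Return the upper bound on an event rate after 0 events in sample_size draws.

    Solves (1 - p) ** sample_size = 1 - confidence for p.
    """
    if sample_size <= 0 or not 0 < confidence < 1:
        raise ValueError("sample_size must be positive and confidence in (0, 1)")
    return 1 - (1 - confidence) ** (1 / sample_size)


bound = zero_event_upper_bound(200)
print(f"95% upper bound on critical rate among discarded items: {bound:.2%}")
```

Under these assumptions the bound is roughly 1.5 percent, close to the familiar "rule of three" approximation of 3/200. Applied to the 3,901 discarded items, a clean sample of 200 is still compatible with about 58 missed critical findings, which is why the referee's request for a reported false-negative rate remains relevant.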

Circularity Check

0 steps flagged

No significant circularity; empirical results from external benchmarks

The paper describes an empirical multi-agent system evaluated on public vulnerability databases (NVD, EUVD). Alert reduction (98%) and CVSS attribute accuracy (89.3%) are reported as measured outcomes on scanner outputs and external data, not as quantities defined in terms of themselves or fitted parameters renamed as predictions. No equations, self-definitional loops, or load-bearing self-citations appear in the abstract or described evaluation chain. The framework's claims rest on experimental results against independent sources rather than internal construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is an applied engineering paper with no mathematical derivations; no free parameters, axioms, or invented entities are introduced beyond the described framework components.

