arxiv: 2605.00741 · v1 · submitted 2026-05-01 · 💻 cs.CR

Recognition: unknown

Self-Adaptive Multi-Agent LLM-Based Security Pattern Selection for IoT Systems

Carol Fung, Foutse Khomh, Kawser Wazed Nafi, Saeid Jamshidi

Pith reviewed 2026-05-09 19:01 UTC · model grok-4.3

keywords IoT security · self-adaptive systems · LLM agents · security pattern selection · MAPE-K loop · resource-constrained defense · multi-agent security · edge computing

The pith

ASPO separates LLM-generated mitigation proposals from a deterministic optimizer to guarantee conflict-free and feasible security in IoT systems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces ASPO, a framework that combines multi-agent LLM reasoning with deterministic optimization inside a MAPE-K loop to select and activate security patterns for resource-constrained IoT environments. LLM agents generate candidate mitigation portfolios, while the optimization core enforces action integrity, absence of conflicts, and resource limits at each decision point. This design targets the shortcomings of static rule sets and learned policies, which offer limited safety and feasibility guarantees under edge constraints. Evaluation on a distributed edge-gateway testbed with replayed IoT attack traffic, across two workloads of 500 and 1000 decisions, reports 100% conflict-free activation, consistent resource feasibility, and tail latency and energy reductions of 21.9% and 23.1%.

Core claim

ASPO integrates LLM-based reasoning with deterministic enforcement within a MAPE-K control loop for self-adaptive security pattern selection in IoT systems. LLM agents propose candidate mitigation portfolios, while the deterministic optimisation core enforces closed-world action integrity, conflict-free composition, and resource feasibility at every decision epoch. Deployed on a distributed edge-gateway testbed and evaluated across two workloads of 500 and 1000 runtime security decisions using replayed IoT attack traffic, the system achieves 100% conflict-free activation, consistent resource feasibility, stable pattern dominance with perfect rank preservation, and tail latency and energy-com

What carries the argument

The separation of stochastic LLM proposal generation from deterministic optimisation enforcement that checks integrity, conflicts, and resource use at each epoch.
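The separation described above can be illustrated with a minimal sketch of a deterministic gate. Everything named here is an assumption made for illustration: the pattern catalogue, the cost figures, the conflict pairs, and the `check_portfolio` and `select` helpers are hypothetical and are not taken from the paper, whose optimisation core may score and rank portfolios rather than take the first admissible one.

```python
from dataclasses import dataclass

# Hypothetical catalogue: pattern name -> (cpu_cost, energy_cost). Illustrative values only.
CATALOGUE = {
    "rate_limit": (10, 5),
    "deep_packet_inspection": (40, 30),
    "ip_blocklist": (5, 2),
    "traffic_redirect": (15, 10),
}
# Hypothetical pairs of patterns that must never be active together.
CONFLICTS = {frozenset({"ip_blocklist", "traffic_redirect"})}


@dataclass(frozen=True)
class Budget:
    cpu: int
    energy: int


def check_portfolio(portfolio, budget):
    """Return (ok, reason) for one candidate portfolio."""
    names = set(portfolio)
    # Closed-world integrity: every proposed action must exist in the catalogue.
    unknown = names.difference(CATALOGUE)
    if unknown:
        return False, f"unknown action(s): {sorted(unknown)}"
    # Conflict-free composition: no forbidden pair may be fully present.
    for pair in CONFLICTS:
        if pair <= names:
            return False, f"conflict: {sorted(pair)}"
    # Resource feasibility: summed costs must fit the budget.
    cpu = sum(CATALOGUE[n][0] for n in names)
    energy = sum(CATALOGUE[n][1] for n in names)
    if cpu > budget.cpu or energy > budget.energy:
        return False, "resource budget exceeded"
    return True, "ok"


def select(candidates, budget):
    """Return the first admissible candidate, or an empty portfolio as a safe fallback."""
    for candidate in candidates:
        ok, _ = check_portfolio(candidate, budget)
        if ok:
            return list(candidate)
    return []


proposals = [["made_up_action"], ["ip_blocklist", "traffic_redirect"], ["rate_limit"]]
print(select(proposals, Budget(cpu=50, energy=50)))  # ['rate_limit']
```

Because the checks are pure functions of the proposal and the budget, no stochastic output from the proposer can reach execution without passing all three.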

If this is right

Every security pattern activation remains free of conflicts.
Resource feasibility holds across all decision epochs in the tested workloads.
Deeper exploration of decisions compresses tail latency by 21.9 percent and tail energy by 23.1 percent.
Pattern dominance stays stable with perfect rank preservation.
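For readers who want to see how a tail-compression figure of this kind is computed, the following is a minimal sketch. The percentile level, the nearest-rank definition, and the sample latencies are all assumptions for illustration; the summary above does not state which percentile or estimator the paper uses, and the numbers below are not the paper's data.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile for an integer p with 0 < p <= 100."""
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline * 100


# Hypothetical per-decision latencies in milliseconds (not from the paper).
shallow_ms = [10, 11, 12, 12, 13, 14, 15, 18, 40, 64]
deep_ms = [10, 11, 12, 12, 13, 14, 15, 17, 35, 50]

tail_shallow = percentile(shallow_ms, 90)
tail_deep = percentile(deep_ms, 90)
print(tail_shallow, tail_deep, relative_reduction(tail_shallow, tail_deep))  # 40 35 12.5
```

A tail can shrink while the mean stays nearly flat, which is the pattern the paper reports for energy.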

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The hybrid structure could transfer to other constrained adaptive systems such as dynamic network routing or energy management where unsafe combinations must be prevented.
If proposal quality varies with different LLMs or prompts, the frequency of optimizer interventions would become a key performance variable to measure.
Testing the same separation on live, non-replayed attack streams could expose whether proposal diversity remains adequate outside controlled replay conditions.

Load-bearing premise

LLM agents will generate candidate portfolios of sufficient quality and diversity that the deterministic core can routinely produce feasible, effective mitigations without frequent rejection or repair.

What would settle it

A trial in which the deterministic core rejects or substantially repairs LLM proposals in more than a small fraction of epochs, producing either unsafe activations or higher resource costs than the claimed baselines.
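The test described above reduces to counting how often the deterministic core has to intervene. A minimal sketch follows, assuming a per-epoch log with three outcome labels; the labels, the `intervention_rate` helper, and the sample log are hypothetical and do not come from the paper.

```python
from collections import Counter


def intervention_rate(outcomes):
    """Fraction of epochs in which a proposal was repaired or rejected."""
    if not outcomes:
        raise ValueError("no epochs recorded")
    counts = Counter(outcomes)
    return (counts["repaired"] + counts["rejected"]) / len(outcomes)


# Hypothetical log: one outcome label per decision epoch.
log = ["accepted"] * 17 + ["repaired"] * 2 + ["rejected"] * 1
print(f"{intervention_rate(log):.2f}")  # 0.15
```

What counts as "more than a small fraction" is a threshold the evaluator has to choose and justify; the sketch only supplies the measurement.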

Figures

Figures reproduced from arXiv: 2605.00741 by Carol Fung, Foutse Khomh, Kawser Wazed Nafi, Saeid Jamshidi.

**Figure 1.** MAPE-K-based architecture of ASPO. Monitor/Analyse: construct a structured context; Plan: combines multi-agent…

**Figure 2.** Distributed edge testbed for MAPE-K-based ASPO. Ten independent edge nodes execute isolated ASPO engines over…

**Figure 3.** Safety funnel illustrating Gate → Audit → Final approval filtering under a side-by-side comparison of LLM backends (DeepSeek v3, GPT-4o-mini, and LLaMA-3.1-8B-Instruct) with identical constraints and replay protocol. […] in the 500-run setting and 995/1000 (99.5%) in the 1000-run setting, leaving a small admissible set for downstream evaluation. All gate-accepted instances are approved (4/4 and 5/5), indicati…

**Figure 4.** Deterministic portfolio score distribution across runs.

**Figure 5.** Approval probability as a function of deterministic…

**Figure 6.** Approval probability as a function of portfolio size,…

**Figure 8.** Runtime decomposition across LLM agents, showing…

**Figure 9.** Component-level empirical cumulative distribution…

**Figure 10.** Reasoner runtime by threat category, showing how decision latency of the reasoning stage varies across attack types…

**Figure 11.** Auditor runtime by threat category, showing how ver…

**Figure 12.** Latency comparison between the 500-run and 1000-run workloads. Panels report ECDF curves, median latency, and…

**Figure 13.** Energy comparison between the 500-run and 1000-run workloads. Panels report ECDF curves, median energy, and…

**Figure 14.** Example ASPO runtime trace for a DoS attack scenario. The log illustrates the interaction between the multi-agent…

Original abstract

The adoption of Internet of Things (IoT) systems at the network edge of smart architectures is increasing rapidly, intensifying the need for security mechanisms that are both adaptive and resource-efficient. In such environments, runtime defence mechanisms are no longer limited to detection alone but become a resource-constrained task of selecting mitigation actions. Security controls must be carefully selected, combined, and executed under latency, energy, and computational constraints, while preventing unsafe interactions between controls. Existing approaches predominantly rely on static rule sets and learned policies, which provide limited guarantees of feasibility, conflict safety, and execution correctness in resource-constrained edge settings. To address this limitation, we introduce ASPO, a self-adaptive multi-agent security pattern selection that integrates Large Language Model (LLM)-based reasoning with deterministic enforcement within a MAPE-K control loop. ASPO explicitly separates stochastic decision generation from execution: LLM agents propose candidate mitigation portfolios, while a deterministic optimisation core enforces closed-world action integrity, conflict-free composition, and resource feasibility at every decision epoch. We deploy ASPO on a distributed edge-gateway testbed and evaluate it across two workloads, each comprising 500 and 1000 runtime security decisions, using replayed IoT attack traffic. In addition, the results demonstrate invariant safety properties, including 100% conflict-free activation, consistent resource feasibility across workloads, and stable pattern dominance with perfect rank preservation. Importantly, deeper decision exploration reduces extreme-case execution costs, compressing tail latency and energy overheads by 21.9% and 23.1%, respectively, without increasing mean energy consumption.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

ASPO puts LLM agents in front of a deterministic optimizer for IoT security pattern selection, but the 100% safety and feasibility numbers are architectural guarantees rather than evidence that the LLM proposals are any good.

The paper describes ASPO, a system that runs multiple LLM agents to generate candidate sets of security mitigations for IoT edge devices under attack, then hands those candidates to a deterministic optimization layer inside a MAPE-K loop. The optimizer checks for conflicts, resource fit, and closed-world integrity before any action runs. The authors tested it on a distributed gateway testbed with replayed attack traffic, using two workloads of 500 and 1000 decisions, and report that deeper exploration cut tail latency by 21.9% and tail energy by 23.1% while keeping mean energy flat and preserving pattern rankings.
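The control flow the letter describes can be sketched as a single MAPE-K epoch. This is a skeleton under stated assumptions: the function names, the shape of the knowledge store, and the stub components are hypothetical, and the real system's agents and optimizer are replaced here by trivial stand-ins.

```python
from typing import Callable, Dict, List

Portfolio = List[str]


def mape_k_epoch(
    observe: Callable[[], Dict],
    propose: Callable[[Dict, Dict], List[Portfolio]],
    enforce: Callable[[List[Portfolio], Dict], Portfolio],
    execute: Callable[[Portfolio], None],
    knowledge: Dict,
) -> Portfolio:
    """Run one decision epoch and record it in the shared knowledge store."""
    context = observe()                       # Monitor + Analyse: build structured context
    candidates = propose(context, knowledge)  # Plan, stochastic part (LLM agents in ASPO)
    chosen = enforce(candidates, knowledge)   # Plan, deterministic part (checks before acting)
    execute(chosen)                           # Execute: only the vetted portfolio runs
    knowledge.setdefault("history", []).append({"context": context, "chosen": chosen})
    return chosen


# Hypothetical stubs standing in for the real components.
def observe():
    return {"threat": "DoS", "cpu_free": 30}


def propose(context, knowledge):
    return [["deep_packet_inspection"], ["rate_limit"]]


def enforce(candidates, knowledge):
    allowed = knowledge["allowed"]
    for candidate in candidates:
        if all(action in allowed for action in candidate):
            return candidate
    return []  # safe fallback: activate nothing


executed = []
knowledge = {"allowed": {"rate_limit"}}
chosen = mape_k_epoch(observe, propose, enforce, executed.append, knowledge)
print(chosen)  # ['rate_limit']
```

Keeping `enforce` between `propose` and `execute` is the whole point of the design: the proposer can be swapped or degraded without changing what is allowed to run.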

Referee Report

2 major / 1 minor

Summary. The manuscript introduces ASPO, a self-adaptive multi-agent framework for security pattern selection in IoT systems. It integrates LLM agents to propose candidate mitigation portfolios within a MAPE-K loop, paired with a deterministic optimization core that enforces action integrity, conflict-free composition, and resource feasibility. Evaluation on a distributed edge testbed using 500- and 1000-decision workloads from replayed IoT attack traffic claims 100% conflict-free activations, consistent feasibility, stable pattern dominance, and tail latency/energy reductions of 21.9% and 23.1% from deeper decision exploration.

Significance. If the empirical results hold and demonstrate that the LLM component meaningfully improves upon the deterministic core alone, this work could provide a valuable hybrid paradigm for runtime security adaptation in resource-constrained IoT environments, combining the exploratory power of LLMs with formal guarantees. It highlights a practical way to achieve adaptive defenses without sacrificing safety invariants.

major comments (2)

[Abstract] The claims of '100% conflict-free activation' and 'consistent resource feasibility' are architectural invariants of the deterministic optimisation core, as explicitly described in the ASPO design, rather than outcomes that could be falsified by poor LLM proposals. To substantiate the contribution of the LLM agents, the paper must report proposal acceptance rates, frequency of repairs by the optimizer, and comparisons to baselines such as static rules or pure optimization without LLM proposals.
[Evaluation] The reported tail reductions of 21.9% latency and 23.1% energy lack accompanying baselines, statistical tests, error bars, or details on the testbed implementation and workload replay. This makes it difficult to assess whether the improvements are significant or attributable to the hybrid approach.
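One way to supply the error bars requested above is a percentile bootstrap on the tail difference between two configurations. The sketch below is an illustration only: the choice of the 95th percentile, the nearest-rank estimator, the resample count, and the synthetic samples are assumptions, not the paper's method or data.

```python
import math
import random


def percentile(values, p):
    """Nearest-rank percentile for an integer p with 0 < p <= 100."""
    ordered = sorted(values)
    return ordered[math.ceil(p * len(ordered) / 100) - 1]


def bootstrap_tail_diff_ci(a, b, p=95, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap interval for percentile(a, p) - percentile(b, p)."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    diffs = []
    for _ in range(n_boot):
        resampled_a = rng.choices(a, k=len(a))  # sample with replacement
        resampled_b = rng.choices(b, k=len(b))
        diffs.append(percentile(resampled_a, p) - percentile(resampled_b, p))
    diffs.sort()
    lower = diffs[int(alpha / 2 * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic, clearly separated samples (not measurements from the paper).
slow = [100 + i for i in range(50)]
fast = list(range(50))
lower, upper = bootstrap_tail_diff_ci(slow, fast)
print(lower > 0, lower <= upper)  # True True
```

An interval that excludes zero supports a real tail difference; attributing it to the hybrid design still needs the baselines the referee asks for.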

minor comments (1)

[Abstract] Clarify what 'deeper decision exploration' entails and how it relates to the multi-agent LLM setup.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the major comments point by point below, agreeing where the manuscript can be strengthened and outlining specific revisions.

Referee: [Abstract] The claims of '100% conflict-free activation' and 'consistent resource feasibility' are architectural invariants of the deterministic optimisation core, as explicitly described in the ASPO design, rather than outcomes that could be falsified by poor LLM proposals. To substantiate the contribution of the LLM agents, the paper must report proposal acceptance rates, frequency of repairs by the optimizer, and comparisons to baselines such as static rules or pure optimization without LLM proposals.

Authors: We agree that the 100% conflict-free activation and resource feasibility are hard invariants enforced by the deterministic optimisation core, which is a deliberate design choice to guarantee safety even with suboptimal LLM proposals. The LLM agents' contribution is in generating contextually relevant and diverse candidate portfolios that enable superior performance outcomes (e.g., the observed tail reductions). In the revised manuscript we will add: proposal acceptance rates, frequency of optimizer repairs, and direct comparisons against baselines including static rules and pure optimization without LLM proposals. These will appear in the Evaluation section with supporting tables. revision: yes
Referee: [Evaluation] The reported tail reductions of 21.9% latency and 23.1% energy lack accompanying baselines, statistical tests, error bars, or details on the testbed implementation and workload replay. This makes it difficult to assess whether the improvements are significant or attributable to the hybrid approach.

Authors: We acknowledge the need for stronger empirical framing. The 21.9% and 23.1% figures reflect gains from deeper LLM-enabled exploration versus shallower depths within the same framework. In revision we will: (i) expand testbed and workload-replay methodology details, (ii) add explicit baselines (static rules, non-LLM optimization), (iii) report statistical significance tests, and (iv) include error bars or confidence intervals. This will clarify attribution to the hybrid design. revision: yes
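The point both sides agree on, that conflict-free activation is an invariant of the deterministic core rather than a measure of proposal quality, can be shown with a small experiment: feed a repair step purely random proposals and the conflict-free rate is still 100%. The action names, the conflict pair, and the `repair` rule below are hypothetical stand-ins, not ASPO's optimizer.

```python
import random

# Hypothetical action set and conflict pairs.
ACTIONS = ["rate_limit", "ip_blocklist", "traffic_redirect", "quarantine"]
CONFLICTS = [{"ip_blocklist", "traffic_redirect"}]


def repair(portfolio):
    """Keep actions in first-seen order, dropping any that conflict with one already kept."""
    kept = []
    for action in portfolio:
        if action in kept:
            continue
        if any(action in pair and (pair - {action}) & set(kept) for pair in CONFLICTS):
            continue
        kept.append(action)
    return kept


def conflict_free(portfolio):
    return not any(pair <= set(portfolio) for pair in CONFLICTS)


rng = random.Random(0)
# Random proposals: no reasoning at all behind them.
proposals = [rng.sample(ACTIONS, rng.randint(0, len(ACTIONS))) for _ in range(1000)]
activated = [repair(p) for p in proposals]
rate = sum(conflict_free(p) for p in activated) / len(activated)
print(rate)  # 1.0
```

The rate is 1.0 by construction, which is why acceptance and repair frequencies, not the invariant itself, are the informative numbers about the LLM agents.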

Circularity Check

0 steps flagged

No significant circularity; empirical system with design invariants

The paper is an empirical system description of ASPO, a MAPE-K loop combining LLM proposal generation with a deterministic optimization core. Central results (100% conflict-free activation, resource feasibility) are stated as invariants enforced by the core at every epoch rather than derived quantities. No equations, parameter fitting, self-referential definitions, or load-bearing self-citations appear in the provided text. Evaluation rests on testbed measurements over replayed workloads, which are externally falsifiable and independent of any internal derivation chain. The architecture separates stochastic and deterministic components explicitly, avoiding any reduction of outputs to inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the domain assumption that LLM agents produce viable candidate portfolios that a deterministic optimizer can reliably turn into safe, feasible decisions; no free parameters or invented physical entities are introduced, only the ASPO architecture itself.

axioms (1)

Domain assumption: LLM agents generate candidate mitigation portfolios of sufficient quality and diversity for the deterministic core to enforce feasibility and safety.
Invoked in the description of the MAPE-K loop where stochastic proposals are filtered by deterministic checks; no evidence on proposal acceptance rate is given in the abstract.

invented entities (1)

ASPO framework (no independent evidence)
purpose: Self-adaptive multi-agent security pattern selection integrating LLM reasoning with deterministic enforcement
The architecture is proposed and named in this work; independent evidence would require external validation beyond the described testbed.
