pith. sign in

arxiv: 2607.02357 · v1 · pith:RUGNNXTNnew · submitted 2026-07-02 · 💻 cs.CR · cs.SE

Cloak and Detonate: Scanner Evasion and Dynamic Detection of Agent Skill Malware

Pith reviewed 2026-07-03 10:26 UTC · model grok-4.3

classification 💻 cs.CR cs.SE
keywords agent skillsmalware evasionstatic scannersruntime detectiontaint analysissupply chain securityLLM security
0
0 comments X

The pith

Static scanners for LLM agent skills fail against payload-preserving evasions, but runtime sandbox execution with taint tracking detects most malicious behavior.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that static scanners relying on pattern matching or code judgment cannot reliably identify malicious third-party skills because attackers can alter appearance without changing attack effects. It introduces SkillCloak to demonstrate concrete evasion methods that keep malicious semantics intact. Motivated by these failures, the work presents SkillDetonate as an alternative that runs skills in a sandbox to observe real actions through information flow rather than install-time form. This behavior-centric method is positioned as a stronger defense for the software supply chain created by agent marketplaces.

Core claim

SkillCloak applies structural obfuscation to rewrite visible indicators into equivalent forms and self-extracting skill packing to conceal components until execution time, allowing attacks to bypass static and hybrid scanners. SkillDetonate responds by executing skills under sandbox control, using on-demand closure lift to capture runtime instructions and marker-based taint analysis to follow sensitive data across files, processes, and network operations, thereby identifying malicious effects through observed OS-boundary flows instead of code appearance.

What carries the argument

Marker-based taint analysis during sandboxed execution that tracks sensitive-data flows across agent context, files, processes, and network operations.

If this is right

  • Static appearance-based auditing cannot withstand adaptive evasions that preserve attack semantics.
  • Sandbox execution surfaces instructions and effects hidden from install-time inspection.
  • Tracking information flows at OS boundaries yields signals of malice independent of code structure.
  • Behavior-centric auditing addresses the supply-chain risk created by third-party skills executing with agent privileges.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Defenses for agent skills may need to shift from pre-install checks to continuous runtime monitoring in production environments.
  • Evasion patterns shown here could transfer to other automated code-execution marketplaces that rely on static review.
  • Taint tracking at the OS boundary might extend to detecting similar hidden threats in non-LLM automated scripting systems.

Load-bearing premise

The sandboxed execution and marker-based taint analysis will observe and correctly classify malicious effects without the malware detecting the sandbox or evading the information-flow tracking.

What would settle it

A malicious skill that performs credential theft, code exfiltration, or backdoor installation inside the sandbox without producing detectable taint flows or OS-boundary evidence would falsify the detection claim.

Figures

Figures reproduced from arXiv: 2607.02357 by Congying Xu, Shing-Chi Cheung, Shuai Wang, Xin Wei, Yudong Gao, Zimo Ji, Zongjie Li.

Figure 1
Figure 1. Figure 1: Overview of SKILLCLOAK and SKILLDETONATE. two production agents (Codex and Claude Code) with 1,613 in￾the-wild malicious skills. We propose SKILLCLOAK, a payload￾preserving evasion framework that adaptively transforms mali￾cious skills while preserving their malicious behavior. SKILLCLOAK operates on two complementary strategies (Section III). 1) Structural Obfuscation rewrites concrete payload indica￾tors… view at source ↗
Figure 2
Figure 2. Figure 2: Overview of the Structural Obfuscation workflow. [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Overview of the SFS Packing workflow. payload from the scanner’s view entirely. It is the skill-level analogue of self-extracting (SFX) packing in binary malware: a packer seals the real payload inside a self-extracting archive that unpacks at runtime, so a static scan sees only a benign stub and an opaque blob, never the malicious code. The same idea applies here, except the unpacking is driven by the age… view at source ↗
Figure 4
Figure 4. Figure 4: Results of SKILLCLOAK’s per-scanner bypass rate. our metric is relative to a baseline detection, so a dataset where every skill carries a real, independently confirmed finding gives each attack a genuine signal to nullify, and using real rather than synthetic skills shows the attack is already actionable on what a marketplace would receive. Metrics. We report two quantities over the skills that each scanne… view at source ↗
Figure 5
Figure 5. Figure 5: Overview of SKILLDETONATE. context and opaque cross-process byte transforms, both judged by two information-flow policies at the OS boundary. B. Behavior-Centric Auditor: SKILLDETONATE SKILLDETONATE takes a suspicious skill as input and outputs a malicious or safe verdict. Its guiding principle is to decide from what a skill does rather than what it looks like: SKILLDETONATE runs the skill in a single sand… view at source ↗
Figure 6
Figure 6. Figure 6: Root cause of missing in-the-wild malicious skills. [PITH_FULL_IMAGE:figures/full_fig_p009_6.png] view at source ↗
read the original abstract

LLM coding agents increasingly rely on third-party agent skills from public marketplaces, which execute with the agent's privileges and create a software supply-chain attack surface: a malicious skill can steal credentials, exfiltrate source code, or install backdoors. Existing defenses use static skill scanners based on pattern matching or LLM-as-judge analysis, but it remains unclear whether they withstand adaptive evasions that preserve malicious behavior while changing payload appearance. This paper first presents an adversarial study of existing skill scanners through SkillCloak, a payload-preserving evasion framework that keeps the attack semantics intact while transforming their visible form. SkillCloak uses two complementary strategies: Structural Obfuscation, which rewrites visible payload indicators into semantically equivalent forms, and Self-Extracting Skill (SFS) Packing, which hides malicious components from the install-time view and restores them during agent execution. Across eight scanners and 1,613 in-the-wild malicious skills, SFS Packing bypasses every scanner at over 90%, while Structural Obfuscation bypasses over 80% on most static scanners and reaches 96% on a hybrid scanner, showing that appearance-based auditing is insufficient. Motivated by this finding, we propose SkillDetonate, a behavior-centric runtime auditor that executes skills in a sandbox and detects malicious effects through OS-boundary information-flow evidence rather than install-time appearance. SkillDetonate combines on-demand closure lift, which observes instructions materialized during execution, with marker-based taint analysis, which tracks sensitive-data flows across the agent context, files, processes, and network operations. The results show that SkillDetonate detects 97% of attacks at a 2% false-positive rate and sustains 87% detection on real-world malicious skills.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper introduces SkillCloak, an evasion framework using structural obfuscation and self-extracting skill packing to bypass static scanners on 1,613 real-world malicious skills, achieving >90% bypass with SFS packing and >80% with obfuscation across eight scanners. Motivated by this, it proposes SkillDetonate, a sandboxed runtime auditor employing on-demand closure lift and marker-based taint analysis to detect malicious effects via information flows, reporting 97% detection at 2% FPR and 87% on real-world skills.

Significance. If the results hold, the work is significant for demonstrating that appearance-based static scanners are insufficient against semantics-preserving evasions in LLM agent skill marketplaces and for proposing a behavior-centric dynamic alternative grounded in OS-boundary evidence. The scale of the empirical study on in-the-wild skills is a clear strength.

major comments (1)
  1. [Abstract and SkillDetonate evaluation] Abstract and § on SkillDetonate evaluation: the central performance claims (97% detection at 2% FPR; 87% on real-world skills) are load-bearing on the untested assumption that the sandbox and marker-based taint tracking will observe malicious flows even when malware can inspect its environment. No results are provided for adaptive skills that fingerprint the sandbox (timing, API responses, file artifacts), use conditional/delayed execution, or apply anti-taint techniques such as implicit flows or environment-specific encoding.
minor comments (2)
  1. [Abstract] The abstract states precise numerical results (bypass percentages, detection rates) without cross-references to the corresponding experimental sections, datasets, or controls.
  2. Notation for 'on-demand closure lift' and 'marker-based taint analysis' is introduced without an early formal definition or pseudocode, which would aid reproducibility.

Simulated Author's Rebuttal

1 responses · 1 unresolved

We thank the referee for the constructive feedback on the evaluation assumptions underlying SkillDetonate. We address the major comment below and will revise the manuscript to better scope our claims.

read point-by-point responses
  1. Referee: [Abstract and SkillDetonate evaluation] Abstract and § on SkillDetonate evaluation: the central performance claims (97% detection at 2% FPR; 87% on real-world skills) are load-bearing on the untested assumption that the sandbox and marker-based taint tracking will observe malicious flows even when malware can inspect its environment. No results are provided for adaptive skills that fingerprint the sandbox (timing, API responses, file artifacts), use conditional/delayed execution, or apply anti-taint techniques such as implicit flows or environment-specific encoding.

    Authors: We agree that the reported detection rates (97% at 2% FPR; 87% on real-world skills) were measured exclusively on the 1,613 in-the-wild malicious skills and the SkillCloak-generated variants, none of which incorporate sandbox fingerprinting, conditional/delayed execution, or anti-taint techniques. SkillDetonate's design intentionally focuses on OS-boundary effects (file, process, and network operations) that are required for the attack to succeed, making some forms of evasion more difficult than pure appearance changes. Nevertheless, the referee is correct that we provide no empirical results against adaptive adversaries targeting the sandbox or taint tracker. We will revise the abstract, evaluation section, and limitations discussion to explicitly qualify the results as applying to non-adaptive malware and to identify sandbox-aware and anti-taint evasion as an open question for future work. No new experimental results can be added at this stage. revision: partial

standing simulated objections not resolved
  • Empirical results for SkillDetonate against adaptive malware using sandbox fingerprinting, conditional execution, or anti-taint techniques

Circularity Check

0 steps flagged

No circularity: empirical evaluation of evasion and detection tools

full rationale

The paper describes an empirical adversarial study (SkillCloak evasion) followed by a runtime detection system (SkillDetonate) whose performance numbers are obtained by direct execution of skills in a sandbox with taint tracking. No derivation chain, equations, or first-principles predictions are presented that reduce to fitted inputs or self-citations by construction. The 97% / 2% FPR and 87% real-world figures are reported outcomes of the evaluation procedure itself, not outputs of a model whose parameters were tuned on the same metrics. The work is therefore self-contained as a standard empirical security study.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based on the abstract alone, no free parameters, axioms, or invented entities are specified.

pith-pipeline@v0.9.1-grok · 5873 in / 1093 out tokens · 44231 ms · 2026-07-03T10:26:03.914813+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

68 extracted references · 23 canonical work pages · 17 internal anchors

  1. [1]

    SWE-bench: Can language models resolve real-world github issues?

    C. E. Jimenez, J. Yang, A. Wettig, S. Yao, K. Pei, O. Press, and K. Narasimhan, “SWE-bench: Can language models resolve real-world github issues?” inInternational Conference on Learning Representations (ICLR), 2024

  2. [2]

    Cybench: A framework for evaluating cybersecurity capabilities and risks of language models,

    A. K. Zhanget al., “Cybench: A framework for evaluating cybersecurity capabilities and risks of language models,” inInternational Conference on Learning Representations (ICLR), 2025

  3. [3]

    GPT-Driver: Learning to Drive with GPT

    J. Mao, Y . Qian, J. Ye, H. Zhao, and Y . Wang, “GPT-Driver: Learning to drive with GPT,”arXiv preprint arXiv:2310.01415, 2023

  4. [4]

    Claude Code,

    Anthropic, “Claude Code,” https://code.claude.com/, 2026

  5. [5]

    Codex: AI coding partner from OpenAI,

    OpenAI, “Codex: AI coding partner from OpenAI,” https://openai.com/ codex/, 2026

  6. [6]

    Introducing agent skills,

    Anthropic, “Introducing agent skills,” https://www.anthropic.com/news/ skills, 2025

  7. [7]

    AI agents market report 2025–2030, by applica- tion, geo, tech,

    MarketsandMarkets, “AI agents market report 2025–2030, by applica- tion, geo, tech,” https://www.marketsandmarkets.com/Market-Reports/ ai-agents-market-15761548.html, 2025, report code TC 9264. Accessed 2026

  8. [8]

    Agent skills: A data-driven analysis of claude skills for extending large language model functionality,

    G. Ling, S. Zhong, and R. Huang, “Agent skills: A data-driven analysis of claude skills for extending large language model functionality,”arXiv preprint arXiv:2602.08004, 2026

  9. [9]

    ClawHavoc: 341 malicious Clawed skills found by the bot they were targeting,

    Koi Security, “ClawHavoc: 341 malicious Clawed skills found by the bot they were targeting,” https://www.koi.ai/blog/ clawhavoc-341-malicious-clawedbot-skills-found-by-the-bot-they-were-targeting, 2026, accessed 2026

  10. [10]

    How Your Credentials Are Leaked by LLM Agent Skills: An Empirical Study

    Z. Chen and Y . Zhang, “How your credentials are leaked by LLM agent skills: An empirical study,”arXiv preprint arXiv:2604.03070, 2026

  11. [11]

    Snyk finds prompt injection in 36%, 1467 malicious payloads in a ToxicSkills study of agent skills supply chain compromise,

    Snyk Security Labs, “Snyk finds prompt injection in 36%, 1467 malicious payloads in a ToxicSkills study of agent skills supply chain compromise,” https://snyk.io/blog/toxicskills-malicious-ai-agent-skills-clawhub/, 2026

  12. [12]

    Agent Skills in the Wild: An Empirical Study of Security Vulnerabilities at Scale

    Y . Liu and W. Wang, “Agent skills in the wild: An empirical study of security vulnerabilities at scale,”arXiv preprint arXiv:2601.10338, 2026

  13. [13]

    "Do Not Mention This to the User": Detecting and Understanding Malicious Agent Skills in the Wild

    Y . Liu, Z. Chen, Y . Zhang, G. Deng, Y . Li, J. Ning, and L. Y . Zhang, ““do not mention this to the user”: Detecting and understanding malicious agent skills in the wild,”arXiv preprint arXiv:2602.06547, 2026

  14. [14]

    SkillSieve: A Hierarchical Triage Framework for Detecting Malicious AI Agent Skills

    Y . Hou, Z. Yang, Z. Pang, and X. Ma, “SkillSieve: A hierarchical triage framework for detecting malicious AI agent skills,”arXiv preprint arXiv:2604.06550, 2026

  15. [15]

    CASCADE: A Cascaded Hybrid Defense Architecture for Prompt Injection Detection in MCP-Based Systems

    ˙I. A. Turgut and E. Gümü¸ s, “CASCADE: A cascaded hybrid defense architecture for prompt-injection detection in MCP-based systems,”arXiv preprint arXiv:2604.17125, 2026

  16. [16]

    "elementary, my dear watson

    S. Wang and J. He, “"elementary, my dear watson." detecting malicious skills via neuro-symbolic reasoning across heterogeneous artifacts,”arXiv preprint arXiv:2603.27204, 2026

  17. [17]

    GitHub – openclaw/skills: All versions of all skills that are on clawhub.com archived · GitHub,

    OpenClaw, “GitHub – openclaw/skills: All versions of all skills that are on clawhub.com archived · GitHub,” https://github.com/openclaw/skills, 2026, public archive of all skills published to ClawHub, including malicious skills removed from the live marketplace. Archived snap- shot: https://web.archive.org/web/20260406111320/https://github.com/ openclaw/skills

  18. [18]

    SkillJect: Effectively Automating Skill-Based Prompt Injection for Skill-Enabled Agents

    X. Jia, J. Liao, S. Qin, J. Gu, W. Ren, X. Cao, Y . Liu, and P. Torr, “SkillJect: Effectively automating skill-based prompt injection for skill- enabled agents,”arXiv preprint arXiv:2602.14211, 2026

  19. [19]

    MalSkillBench: A Runtime-Verified Benchmark of Malicious Agent Skills

    W. Guo, W. Zeng, C. Liu, X. Jia, Y . Xu, L. Tang, Y . Fang, and Y . Liu, “MalSkillBench: A runtime-verified benchmark of malicious agent skills,” arXiv preprint arXiv:2606.07131, 2026

  20. [20]

    Skill-Inject: Measuring Agent Vulnerability to Skill File Attacks

    D. Schmotz, L. Beurer-Kellner, S. Abdelnabi, and M. Andriushchenko, “Skill-Inject: Measuring agent vulnerability to skill file attacks,”arXiv preprint arXiv:2602.20156, 2026

  21. [21]

    Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems

    Y . Qu and Y . Liu, “Supply-chain poisoning attacks against LLM coding agent skill ecosystems,”arXiv preprint arXiv:2604.03081, 2026

  22. [22]

    skill-security-scan,

    HuiFer, “skill-security-scan,” https://github.com/huifer/skill-security-scan, 2026

  23. [23]

    vigile-scan,

    Vigile.dev, “vigile-scan,” https://github.com/Vigile-ai/vigile-scan, 2026

  24. [24]

    SkillFortify: Formal capability verification for agent skills,

    V . P. Bhardwaj, “SkillFortify: Formal capability verification for agent skills,” https://github.com/qualixar/skillfortify, 2026

  25. [25]

    skill-audit,

    M. Pors, “skill-audit,” https://github.com/pors/skill-audit, 2026

  26. [26]

    agent-audit,

    HeadyZhang, “agent-audit,” https://github.com/HeadyZhang/agent-audit, 2026

  27. [27]

    Skillscanner,

    G. Patidar, “Skillscanner,” https://github.com/patidarganesh/SkillScanner, 2026

  28. [28]

    caterpillar,

    Alice IO, “caterpillar,” https://github.com/alice-dot-io/caterpillar, 2026

  29. [29]

    Cisco AI defense skill scanner,

    Cisco AI Defense, “Cisco AI defense skill scanner,” https://github.com/ cisco-ai-defense/skill-scanner, 2026

  30. [30]

    nova-proximity,

    T. Roccia, “nova-proximity,” https://github.com/fr0gger/proximity, 2026

  31. [31]

    Code obfuscation techniques for metamorphic viruses,

    J.-M. Borello and L. Mé, “Code obfuscation techniques for metamorphic viruses,”Journal in Computer Virology, vol. 4, no. 3, pp. 211–220, 2008

  32. [32]

    PhantomSkill: Malicious Code Injection in Agent Skill Ecosystems

    Y .-T. Lin and C.-M. Yu, “PhantomSkill: Malicious code injection in agent skill ecosystems,”arXiv preprint arXiv:2606.19191, 2026

  33. [33]

    SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks

    X. Liet al., “SkillsBench: Benchmarking how well agent skills work across diverse tasks,”arXiv preprint arXiv:2602.12670, 2026

  34. [34]

    Backstabber’s knife collection: A review of open source software supply chain attacks,

    M. Ohm, H. Plate, A. Sykosch, and M. Meier, “Backstabber’s knife collection: A review of open source software supply chain attacks,” inDetection of Intrusions and Malware, and Vulnerability Assessment (DIMVA), 2020

  35. [35]

    Detecting environment-sensitive malware,

    M. Lindorfer, C. Kolbitsch, and P. M. Comparetti, “Detecting environment-sensitive malware,” inInternational Symposium on Recent Advances in Intrusion Detection (RAID), 2011

  36. [36]

    BareCloud: Bare-metal analysis- based evasive malware detection,

    D. Kirat, G. Vigna, and C. Kruegel, “BareCloud: Bare-metal analysis- based evasive malware detection,” inUSENIX Security Symposium, 2014, pp. 287–301

  37. [37]

    X-Force: Force- executing binary programs for security applications,

    F. Peng, Z. Deng, X. Zhang, D. Xu, Z. Lin, and Z. Su, “X-Force: Force- executing binary programs for security applications,” inUSENIX Security Symposium, 2014, pp. 829–844

  38. [38]

    Computer viruses: Theory and experiments,

    F. Cohen, “Computer viruses: Theory and experiments,”Computers & Security, vol. 6, no. 1, pp. 22–35, 1987

  39. [39]

    Semantics-aware malware detection,

    M. Christodorescu, S. Jha, S. A. Seshia, D. Song, and R. E. Bryant, “Semantics-aware malware detection,” inIEEE Symposium on Security and Privacy (S&P), 2005

  40. [40]

    Not what you’ve signed up for: Compromising real-world LLM- integrated applications with indirect prompt injection,

    K. Greshake, S. Abdelnabi, S. Mishra, C. Endres, T. Holz, and M. Fritz, “Not what you’ve signed up for: Compromising real-world LLM- integrated applications with indirect prompt injection,” inProceedings of the 16th ACM Workshop on Artificial Intelligence and Security (AISec), 2023

  41. [41]

    InjecAgent: Benchmarking indirect prompt injections in tool-integrated large language model agents,

    Q. Zhan, Z. Liang, Z. Ying, and D. Kang, “InjecAgent: Benchmarking indirect prompt injections in tool-integrated large language model agents,” inFindings of the Association for Computational Linguistics (ACL), 2024

  42. [42]

    AgentDojo: A dynamic environment to evaluate prompt injection attacks and defenses for LLM agents,

    E. Debenedetti, J. Zhang, M. Balunovi ´c, L. Beurer-Kellner, M. Fischer, and F. Tramèr, “AgentDojo: A dynamic environment to evaluate prompt injection attacks and defenses for LLM agents,” inAdvances in Neural Information Processing Systems (NeurIPS) Datasets and Benchmarks Track, 2024

  43. [43]

    Progent: Securing AI Agents with Privilege Control

    T. Shiet al., “Progent: Securing AI agents with privilege control,”arXiv preprint arXiv:2504.11703, 2025

  44. [44]

    AgentSpec: Customizable runtime enforcement for safe and reliable LLM agents,

    H. Wang, C. M. Poskitt, and J. Sun, “AgentSpec: Customizable runtime enforcement for safe and reliable LLM agents,” inProceedings of the 48th International Conference on Software Engineering (ICSE), 2026

  45. [45]

    Contextual agent security: A policy for every purpose,

    L. Tsaiet al., “Contextual agent security: A policy for every purpose,” inWorkshop on Hot Topics in Operating Systems (HotOS), 2025

  46. [46]

    Defeating Prompt Injections by Design

    E. Debenedetti, I. Shumailov, T. Fan, J. Hayes, N. Carlini, D. Fabian, C. Kern, C. Shi, A. Terzis, and F. Tramèr, “Defeating prompt injections by design,”arXiv preprint arXiv:2503.18813, 2025

  47. [47]

    Rtbas: Defending llm agents against prompt injection and privacy leakage,

    P. Zhonget al., “RTBAS: Defending LLM agents against prompt injection and privacy leakage,”arXiv preprint arXiv:2502.08966, 2025

  48. [48]

    Securing AI Agents with Information-Flow Control

    M. Costa, B. Köpfet al., “Securing AI agents with information-flow control,”arXiv preprint arXiv:2505.23643, 2025

  49. [49]

    IsolateGPT: An execution isolation architecture for LLM-based agentic systems,

    Y . Wu, F. Roesner, T. Kohno, N. Zhang, and U. Iqbal, “IsolateGPT: An execution isolation architecture for LLM-based agentic systems,” in Network and Distributed System Security Symposium (NDSS), 2025

  50. [50]

    Taxonomy, evaluation and exploitation of IPI-centric LLM agent defense frameworks,

    Z. Ji, X. Wang, Z. Li, P. Ma, Y . Gao, D. Wu, X. Yan, T. Tian, and S. Wang, “Taxonomy, evaluation and exploitation of IPI-centric LLM agent defense frameworks,”arXiv preprint arXiv:2511.15203, 2025

  51. [51]

    Taming various privilege escalation in LLM-based agent systems: a mandatory access control framework,

    Z. Ji, D. Wu, W. Jiang, P. Ma, Z. Li, Y . Gao, S. Wang, and Y . Li, “Taming various privilege escalation in LLM-based agent systems: A mandatory access control framework,”arXiv preprint arXiv:2601.11893, 2026

  52. [52]

    Optimization-based prompt injection attack to LLM-as-a-judge,

    J. Shi, Z. Yuan, Y . Liu, Y . Huang, P. Zhou, L. Sun, and N. Z. Gong, “Optimization-based prompt injection attack to LLM-as-a-judge,” inProceedings of the ACM SIGSAC Conference on Computer and Communications Security (CCS), 2024

  53. [53]

    Is LLM-as-a-judge robust? inves- tigating universal adversarial attacks on zero-shot LLM assessment,

    V . Raina, A. Liusie, and M. Gales, “Is LLM-as-a-judge robust? inves- tigating universal adversarial attacks on zero-shot LLM assessment,” inProceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

  54. [54]

    The Attacker Moves Second: Stronger Adaptive Attacks Bypass Defenses Against Llm Jailbreaks and Prompt Injections

    M. Nasr, N. Carliniet al., “The attacker moves second: Stronger adaptive attacks bypass defenses against LLM jailbreaks and prompt injections,” arXiv preprint arXiv:2510.09023, 2025

  55. [55]

    Bypassing LLM guardrails: An empirical analysis of evasion attacks against prompt injection and jailbreak detection systems,

    W. Hackettet al., “Bypassing LLM guardrails: An empirical analysis of evasion attacks against prompt injection and jailbreak detection systems,” inACL Workshop on Large Language Model Security (LLMSEC), 2025

  56. [56]

    An undetectable computer virus,

    D. M. Chess and S. R. White, “An undetectable computer virus,” in Proceedings of the Virus Bulletin Conference, 2000

  57. [57]

    Limits of static analysis for malware detection,

    A. Moser, C. Kruegel, and E. Kirda, “Limits of static analysis for malware detection,” in23rd Annual Computer Security Applications Conference (ACSAC), 2007

  58. [58]

    Learning to Evade Static PE Machine Learning Malware Models via Reinforcement Learning

    H. S. Anderson, A. Kharkar, B. Filar, D. Evans, and P. Roth, “Learning to evade static PE machine learning malware models via reinforcement learning,”arXiv preprint arXiv:1801.08917, 2018

  59. [59]

    Intriguing properties of adversarial ML attacks in the problem space,

    F. Pierazzi, F. Pendlebury, J. Cortellazzi, and L. Cavallaro, “Intriguing properties of adversarial ML attacks in the problem space,” inIEEE Symposium on Security and Privacy (S&P), 2020

  60. [60]

    Dynamic taint analysis for automatic detection, analysis, and signature generation of exploits on commodity software,

    J. Newsome and D. Song, “Dynamic taint analysis for automatic detection, analysis, and signature generation of exploits on commodity software,” inNetwork and Distributed System Security Symposium (NDSS), 2005

  61. [61]

    Dytan: A generic dynamic taint analysis framework,

    J. Clause, W. Li, and A. Orso, “Dytan: A generic dynamic taint analysis framework,” inInternational Symposium on Software Testing and Analysis (ISSTA), 2007

  62. [62]

    libdft: Practical dynamic data flow tracking for commodity systems,

    V . P. Kemerlis, G. Portokalidis, K. Jee, and A. D. Keromytis, “libdft: Practical dynamic data flow tracking for commodity systems,” in ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (VEE), 2012

  63. [63]

    Panorama: Capturing system-wide information flow for malware detection and analysis,

    H. Yin, D. Song, M. Egele, C. Kruegel, and E. Kirda, “Panorama: Capturing system-wide information flow for malware detection and analysis,” inACM Conference on Computer and Communications Security (CCS), 2007

  64. [64]

    TaintDroid: An information-flow tracking system for realtime privacy monitoring on smartphones,

    W. Enck, P. Gilbert, B.-G. Chun, L. P. Cox, J. Jung, P. McDaniel, and A. N. Sheth, “TaintDroid: An information-flow tracking system for realtime privacy monitoring on smartphones,” inUSENIX Symposium on Operating Systems Design and Implementation (OSDI), 2010

  65. [65]

    All you ever wanted to know about dynamic taint analysis and forward symbolic execution (but might have been afraid to ask),

    E. J. Schwartz, T. Avgerinos, and D. Brumley, “All you ever wanted to know about dynamic taint analysis and forward symbolic execution (but might have been afraid to ask),” inIEEE Symposium on Security and Privacy (S&P), 2010

  66. [66]

    A sense of self for unix processes,

    S. Forrest, S. A. Hofmeyr, A. Somayaji, and T. A. Longstaff, “A sense of self for unix processes,” inIEEE Symposium on Security and Privacy (S&P), 1996

  67. [67]

    Intrusion detection using sequences of system calls,

    S. A. Hofmeyr, S. Forrest, and A. Somayaji, “Intrusion detection using sequences of system calls,”Journal of Computer Security, vol. 6, no. 3, pp. 151–180, 1998

  68. [68]

    eAudit: A fast, scalable and deployable audit data collection system,

    R. Sekaret al., “eAudit: A fast, scalable and deployable audit data collection system,” inIEEE Symposium on Security and Privacy (S&P), 2024