arxiv: 2604.03081 · v1 · submitted 2026-04-03 · 💻 cs.CR · cs.AI· cs.CL

Recognition: 2 theorem links

· Lean Theorem

Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems

Gelei Deng, Lei Ma, Leo Yu Zhang, Tongcheng Geng, Yi Liu, Ying Zhang, Yubin Qu, Yuekang Li

Pith reviewed 2026-05-13 19:44 UTC · model grok-4.3

classification 💻 cs.CR cs.AIcs.CL

keywords LLM agentssupply chain attackscoding skillsdocument poisoningimplicit executionadversarial examplessecurity vulnerabilities

0 comments

The pith

Malicious payloads hidden in skill documentation examples can hijack LLM coding agents during normal use.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper investigates supply-chain attacks on LLM-based coding agents that rely on third-party skills from open marketplaces. It proposes Document-Driven Implicit Payload Execution (DDIPE) as a way to embed malicious logic in code examples and configuration templates inside skill documentation. Agents reuse these examples without scrutiny, causing the payloads to execute with system-level privileges. Testing across four frameworks and five models shows success rates between 11.6% and 33.5%, while direct instruction attacks fail against strong defenses. A small percentage of cases evade both static analysis and alignment safeguards.

Core claim

DDIPE embeds malicious payloads within the documentation of skills, specifically in code examples and templates, so that when LLM coding agents incorporate these skills and reuse the examples in their tasks, the malicious actions such as file writes and shell commands are performed implicitly without needing explicit user prompts.

What carries the argument

Document-Driven Implicit Payload Execution (DDIPE), an LLM-driven pipeline that generates adversarial skills by placing payloads in documentation that agents treat as operational code.

If this is right

Skill marketplaces must implement security reviews that include scanning documentation for embedded executable logic.
Existing alignment techniques in LLM agents are insufficient against implicit payload execution from reused examples.
Static analysis tools catch most but not all such attacks, leaving a residual risk of 2.5%.
Responsible disclosure of these vulnerabilities resulted in confirmed issues and partial fixes in affected frameworks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar implicit execution risks could apply to non-coding LLM agents that reuse documentation or examples from plugins.
Developers should consider sandboxing or isolating the execution of any code derived from skill documentation.
Attackers might extend this to target specific MITRE ATT&CK techniques for stealthier persistence in agent ecosystems.

Load-bearing premise

LLM coding agents will reuse code examples and configuration templates from skill documentation as operational directives without additional scrutiny or sanitization.

What would settle it

A test in which an LLM coding agent is given a skill with malicious documentation but is explicitly instructed to ignore all code in documentation and only use verified actions, checking if any payload still executes.

Figures

Figures reproduced from arXiv: 2604.03081 by Gelei Deng, Lei Ma, Leo Yu Zhang, Tongcheng Geng, Yi Liu, Ying Zhang, Yubin Qu, Yuekang Li.

**Figure 1.** Figure 1: A poisoned pptx skill. Left: the highlighted line disguises exfiltration as a routine backup. Right: the referenced script silently uploads documents to an attacker-controlled server. This leaves an open question: can poisoned skills induce coding agents to execute malicious payloads on the host system despite safety alignment and architectural defenses? Answering this question requires addressing three … view at source ↗

**Figure 2.** Figure 2: End-to-end threat scenario for PoisonedSkills. The [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗

**Figure 3.** Figure 3: Running examples of Document-Driven Implicit [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗

**Figure 4.** Figure 4: Universal breach case study. This 479-byte payload [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗

**Figure 5.** Figure 5: Compliance trap case study (Conda environment [PITH_FULL_IMAGE:figures/full_fig_p009_5.png] view at source ↗

read the original abstract

LLM-based coding agents extend their capabilities via third-party agent skills distributed through open marketplaces without mandatory security review. Unlike traditional packages, these skills are executed as operational directives with system-level privileges, so a single malicious skill can compromise the host. Prior work has not examined whether supply-chain attacks can directly hijack an agent's action space, such as file writes, shell commands, and network requests, despite existing safeguards. We introduce Document-Driven Implicit Payload Execution (DDIPE), which embeds malicious logic in code examples and configuration templates within skill documentation. Because agents reuse these examples during normal tasks, the payload executes without explicit prompts. Using an LLM-driven pipeline, we generate 1,070 adversarial skills from 81 seeds across 15 MITRE ATTACK categories. Across four frameworks and five models, DDIPE achieves 11.6% to 33.5% bypass rates, while explicit instruction attacks achieve 0% under strong defenses. Static analysis detects most cases, but 2.5% evade both detection and alignment. Responsible disclosure led to four confirmed vulnerabilities and two fixes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper shows a plausible new supply-chain vector for LLM coding agents via poisoned skill docs, with concrete bypass numbers across setups, but the tests may not fully isolate implicit reuse from explicit context inclusion.

read the letter

The main point is that this work identifies a supply-chain attack on LLM coding agent skills by hiding payloads inside documentation examples and templates that agents reuse during ordinary tasks. They call it DDIPE and back it with an LLM pipeline that turned 81 seeds into 1,070 adversarial skills spanning 15 attack categories. Tests on four frameworks and five models produced bypass rates of 11.6% to 33.5%, while explicit instruction attacks hit 0% under strong defenses, and 2.5% slipped past both static analysis and alignment. Responsible disclosure produced four confirmed issues and two fixes. That combination of scale and follow-through is the strongest part of the paper and gives it a practical edge over purely conceptual attacks. The approach is new in focusing on implicit execution inside agent skill ecosystems rather than direct prompt injection or package tampering. The empirical spread across multiple frameworks and models adds weight to the claim that this is not an isolated artifact. The soft spot is the core assumption that agents will treat doc examples as operational directives without extra prompting or sanitization. The abstract states this reuse occurs during normal tasks, yet the available description does not detail the exact task prompts, documentation access method, or controls that would rule out the docs being supplied in the agent context. If that happened, the measured rates would not cleanly demonstrate the claimed supply-chain vector. The 2.5% evasion figure is low enough to be noteworthy but would benefit from more on how it was isolated and whether false-positive rates were checked. This paper is for people working on LLM agent security and supply-chain defenses in development tools. It deserves a serious referee because the attack surface is timely and the empirical results are specific enough to be checked and extended, even with the methodology gaps.

Referee Report

2 major / 1 minor

Summary. The paper claims that third-party skills for LLM coding agents can be poisoned via Document-Driven Implicit Payload Execution (DDIPE), which embeds malicious payloads in code examples and configuration templates inside skill documentation. Because agents reuse these examples during normal tasks, the payloads execute implicitly (e.g., file writes, shell commands, network requests) without explicit user prompts, bypassing alignment and explicit-instruction defenses. The authors generate 1,070 adversarial skills from 81 seeds across 15 MITRE ATT&CK categories and report bypass rates of 11.6–33.5% across four frameworks and five models (versus 0% for explicit attacks under strong defenses), with 2.5% evading both static analysis and alignment; responsible disclosure yielded four confirmed vulnerabilities and two fixes.

Significance. If the empirical results hold, the work identifies a previously unexamined supply-chain attack surface in open LLM agent skill marketplaces, where skills run with system-level privileges. The concrete bypass measurements, scale of the generated attack corpus, contrast with explicit-instruction baselines, and responsible-disclosure outcomes provide actionable evidence that current safeguards are insufficient. The use of MITRE ATT&CK categories for attack diversity is a methodological strength.

major comments (2)

[Evaluation] Evaluation section: The description of task prompts, documentation access mechanism (whether docs are explicitly injected into the agent context or accessed autonomously), and controls isolating implicit reuse from explicit references is insufficient. Without these details, the reported 11.6–33.5% bypass rates cannot be confirmed to demonstrate the claimed supply-chain vector rather than an artifact of the experimental setup.
[Results] Results: The 2.5% figure for cases evading both static analysis and alignment is presented without per-framework/model breakdown, error bars, or statistical tests. This quantity is load-bearing for the claim that existing defenses are inadequate.

minor comments (1)

[Abstract] The abstract and results sections name neither the four frameworks nor the five models; explicit enumeration would improve reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment point by point below and commit to revisions that strengthen the clarity and evidentiary basis of our claims.

read point-by-point responses

Referee: [Evaluation] Evaluation section: The description of task prompts, documentation access mechanism (whether docs are explicitly injected into the agent context or accessed autonomously), and controls isolating implicit reuse from explicit references is insufficient. Without these details, the reported 11.6–33.5% bypass rates cannot be confirmed to demonstrate the claimed supply-chain vector rather than an artifact of the experimental setup.

Authors: We agree that the Evaluation section requires additional detail to allow readers to fully assess the experimental setup. In the revised manuscript we will add a dedicated subsection that (1) lists the exact task prompt templates used, (2) clarifies that skill documentation is retrieved autonomously by the agent during normal task execution rather than being explicitly injected by the user, and (3) describes the control conditions (baseline runs with clean documentation and runs that explicitly reference the malicious content) used to isolate implicit payload reuse. These additions will include representative prompt examples and a diagram of the documentation-access flow. revision: yes
Referee: [Results] Results: The 2.5% figure for cases evading both static analysis and alignment is presented without per-framework/model breakdown, error bars, or statistical tests. This quantity is load-bearing for the claim that existing defenses are inadequate.

Authors: We acknowledge that the 2.5% aggregate evasion rate would be more robust if accompanied by granular statistics. In the revision we will expand the Results section to report the evasion rate broken down by framework and model, include binomial confidence intervals as error bars, and apply appropriate statistical tests (e.g., proportion tests against the explicit-attack baseline) to support the claim that current defenses remain inadequate. If any cell counts are too small for reliable per-model inference we will note this limitation explicitly. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical attack rates are measured outputs, not constructed quantities

full rationale

The paper is an empirical security study that measures bypass rates (11.6–33.5 %) for the DDIPE attack across frameworks and models. No equations, fitted parameters, or derivation steps appear in the text. The central premise that agents reuse documentation examples is stated as an observed behavioral assumption and is not derived from or reduced to any prior result by the authors. Self-citations, if present, are not load-bearing for the reported attack success rates or the 2.5 % evasion figure, which are direct experimental measurements rather than renamings or self-definitional constructs. The work is therefore self-contained against external benchmarks with no reduction of outputs to inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on domain assumptions about agent behavior with documentation and the effectiveness of the adversarial generation process; no free parameters or invented entities are introduced.

axioms (1)

domain assumption LLM coding agents reuse code examples and configuration templates from skill documentation as executable directives during normal operation.
This premise is required for the implicit payload to execute without explicit prompts.

pith-pipeline@v0.9.0 · 5508 in / 1342 out tokens · 45128 ms · 2026-05-13T19:44:08.759002+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear
Document-Driven Implicit Payload Execution (DDIPE), which embeds malicious logic in code examples and configuration templates within skill documentation
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear
Across four frameworks and five models, DDIPE achieves 11.6% to 33.5% bypass rates

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

ClawLess: A Security Model of AI Agents
cs.CR 2026-04 unverdicted novelty 5.0

ClawLess introduces a formal fine-grained security model for AI agents with runtime-adaptive policies enforced via user-space kernel and BPF syscall interception.
Security Attack and Defense Strategies for Autonomous Agent Frameworks: A Layered Review with OpenClaw as a Case Study
cs.CR 2026-04 conditional novelty 4.0

The survey organizes security threats and defenses in autonomous LLM agents into four layers and identifies that risks can propagate across layers from inputs to ecosystem impacts.

Reference graph

Works this paper leans on

54 extracted references · 54 canonical work pages · cited by 2 Pith papers · 11 internal anchors

[1]

Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Flo- rencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shya- mal Anadkat, et al. 2023. Gpt-4 technical report.arXiv preprint arXiv:2303.08774 (2023)

work page internal anchor Pith review Pith/arXiv arXiv 2023
[2]

Anonymous. 2026. Poisoning Agent Skills Replication Package. https://sites. google.com/view/poisoning-agent-skills

work page 2026
[3]

Anthropic. 2024. The Claude 3 model family: Opus, Sonnet, haiku. Anthropic, tech. rep.(2024). https://www-cdn.anthropic.com/ de8ba9b01c9ab7cbabf5c33b80b7bbc618857627/Model_Card_Claude_3.pdf

work page 2024
[4]

Anthropic. 2024. Equipping Agents for the Real World with Agent Skills. https: //claude.com/blog/equipping-agents-for-the-real-world-with-agent-skills. Of- ficial blog post introducing the Agent Skills framework and the SKILL.md spec- ification

work page 2024
[5]

Anthropic. 2024. Model Context Protocol: Standardizing Context for AI Agents. https://www.anthropic.com/news/model-context-protocol. Accessed: 2025-02- 20

work page 2024
[6]

Ron Artstein and Massimo Poesio. 2008. Survey article: Inter-coder agreement for computational linguistics.Computational linguistics34, 4 (2008), 555–596

work page 2008
[7]

Ralf Bender. 2001. Calculating confidence intervals for the number needed to treat.Controlled clinical trials22, 2 (2001), 102–110

work page 2001
[8]

Manish Bhatt, Sahana Chennabasappa, Yue Li, Cyrus Nikolaidis, Daniel Song, Shengye Wan, Faizan Ahmad, Cornelius Aschermann, Yaohui Chen, Dhaval Kapil, et al. 2024. Cyberseceval 2: A wide-ranging cybersecurity evaluation suite for large language models.arXiv preprint arXiv:2404.13161(2024)

work page arXiv 2024
[9]

Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. 2020. Language models are few-shot learners.Advances in neural information processing systems33 (2020), 1877–1901

work page 2020
[10]

Nicholas Carlini, Matthew Jagielski, Christopher A Choquette-Choo, Daniel Paleka, Will Pearce, Hyrum Anderson, Andreas Terzis, Kurt Thomas, and Flo- rian Tramèr. 2024. Poisoning web-scale training datasets is practical. In2024 IEEE Symposium on Security and Privacy (SP). IEEE, 407–425

work page 2024
[11]

Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde De Oliveira Pinto, Jared Kaplan, Harri Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, et al. 2021. Evaluating large language models trained on code. arXiv preprint arXiv:2107.03374(2021)

work page internal anchor Pith review Pith/arXiv arXiv 2021
[12]

Aviv Donenfeld and Eran Oded. 2026. Caught in the Hook: RCE and API Token Exfiltration Through Claude Code Project Files (CVE-2025-59536). Check Point Research. https://research.checkpoint.com/2026/rce-and-api- token-exfiltration-through-claude-code-project-files-cve-2025-59536/

work page 2026
[13]

Sleeper

David Gewirtz. 2010. The Threat Of" Sleeper" Software.Journal of Counterter- rorism & Homeland Security International16, 4 (2010)

work page 2010
[14]

Fabrizio Gilardi, Meysam Alizadeh, and Maël Kubli. 2023. ChatGPT outperforms crowd workers for text-annotation tasks.Proceedings of the National Academy of Sciences120, 30 (2023), e2305016120

work page 2023
[15]

GitHub. 2026. GitHub Advisory Database. https://github.com/advisories. [On- line; accessed March-2026]

work page 2026
[16]

Team Glm, Aohan Zeng, Bin Xu, Bowen Wang, Chenhui Zhang, Da Yin, Dan Zhang, Diego Rojas, Guanyu Feng, Hanlin Zhao, et al. 2024. Chatglm: A fam- ily of large language models from glm-130b to glm-4 all tools.arXiv preprint arXiv:2406.12793(2024)

work page internal anchor Pith review Pith/arXiv arXiv 2024
[17]

Kai Greshake, Sahar Abdelnabi, Shailesh Mishra, Christoph Endres, Thorsten Holz, and Mario Fritz. 2023. Not what you’ve signed up for: Compromising real- world llm-integrated applications with indirect prompt injection. InProceedings of the 16th ACM workshop on artificial intelligence and security. 79–90

work page 2023
[18]

Tianyu Gu, Brendan Dolan-Gavitt, and Siddharth Garg. 2017. Badnets: Identify- ing vulnerabilities in the machine learning model supply chain.arXiv preprint arXiv:1708.06733(2017)

work page internal anchor Pith review Pith/arXiv arXiv 2017
[19]

Wenyi Hong, Wenmeng Yu, Xiaotao Gu, Guo Wang, Guobing Gan, Haomiao Tang, Jiale Cheng, Ji Qi, Junhui Ji, Lihang Pan, et al. 2025. Glm-4.5 v and glm-4.1 v-thinking: Towards versatile multimodal reasoning with scalable reinforcement learning.arXiv preprint arXiv:2507.01006(2025)

work page internal anchor Pith review Pith/arXiv arXiv 2025
[20]

Aaruni Kaushik. 2024. Predefined Software Environment Runtimes as a Measure for Reproducibility. InInternational Congress on Mathematical Software. Springer, 245–253

work page 2024
[21]

Yi Liu, Gelei Deng, Yuekang Li, Kailong Wang, Zihao Wang, Xiaofeng Wang, Tianwei Zhang, Yepang Liu, Haoyu Wang, Yan Zheng, et al. 2023. Prompt injec- tion attack against llm-integrated applications.arXiv preprint arXiv:2306.05499 (2023)

work page internal anchor Pith review Pith/arXiv arXiv 2023
[22]

Yang Liu, Dan Iter, Yichong Xu, Shuohang Wang, Ruochen Xu, and Chenguang Zhu. 2023. G-eval: NLG evaluation using gpt-4 with better human alignment. InProceedings of the 2023 conference on empirical methods in natural language processing. 2511–2522

work page 2023
[23]

Yi Liu, Weizhe Wang, Ruitao Feng, Yao Zhang, Guangquan Xu, Gelei Deng, Yuekang Li, and Leo Zhang. 2026. Agent Skills in the Wild: An Empirical Study of Security Vulnerabilities at Scale.arXiv preprint arXiv:2601.10338(2026)

work page arXiv 2026
[24]

Yuxing Ma, Audris Mockus, Russel Zaretzki, Randy Bradley, and Bogdan Bich- escu. 2020. A methodology for analyzing uptake of software technologies among developers.IEEE transactions on software engineering48, 2 (2020), 485–501

work page 2020
[25]

Gráinne McLoughlin, Máté Gyurkovics, Jason Palmer, and Scott Makeig. 2022. Midfrontal theta activity in psychiatric illness: an index of cognitive vulnerabil- ities across disorders.Biological psychiatry91, 2 (2022), 173–182

work page 2022
[26]

MiniMax. 2024. MiniMax Large Language Model API Documentation. https: //api.minimax.chat/. [Online; accessed March-2026]

work page 2024
[27]

National Institute of Standards and Technology (NIST). 2026. National Vulnera- bility Database (NVD). https://nvd.nist.gov/. [Online; accessed March-2026]

work page 2026
[28]

NSFOCUS Security Research Team. 2026. Interpretation of Recent Ecosystem Security Events: From RCE Vulnerabilities to Skill Supply Chain Poisoning. https://blog.nsfocus.net/openclaw/. [Online; accessed March-2026]

work page 2026
[29]

Marc Ohm, Henrik Plate, Arnold Sykosch, and Michael Meier. 2020. Backstab- ber’s knife collection: A review of open source software supply chain attacks. In International Conference on Detection of Intrusions and Malware, and Vulnerabil- ity Assessment. Springer, 23–43

work page 2020
[30]

OpenRouter. 2026. OpenRouter Model Rankings. https://openrouter.ai/rankings. Accessed: 2026-03-15

work page 2026
[31]

OWASP Foundation. 2023. OWASP Top 10 for Large Language Model Ap- plications. https://owasp.org/www-project-top-10-for-large-language-model- applications/

work page 2023
[32]

Ethan Perez, Saffron Huang, Francis Song, Trevor Cai, Roman Ring, John Aslanides, Amelia Glaese, Nat McAleese, and Geoffrey Irving. 2022. Red teaming language models with language models.arXiv preprint arXiv:2202.03286(2022)

work page internal anchor Pith review Pith/arXiv arXiv 2022
[33]

Yujia Qin, Shengding Hu, Yankai Lin, Weize Chen, Ning Ding, Ganqu Cui, Zheni Zeng, Xuanhe Zhou, Yufei Huang, Chaojun Xiao, et al. 2024. Tool learning with foundation models.Comput. Surveys57, 4 (2024), 1–40

work page 2024
[34]

Yubin Qu, Song Huang, Long Li, Peng Nie, and Yongming Yao. 2025. Beyond Intentions: A Critical Survey of Misalignment in LLMs.Computers, Materials & Continua85, 1 (2025)

work page 2025
[35]

Yubin Qu, Song Huang, and Peng Nie. 2025. A review of backdoor attacks and defenses in code large language models: Implications for security measures.In- formation and Software Technology(2025), 107707

work page 2025
[36]

Johann Rehberger. 2024. Exfiltrating Personal Data from ChatGPT via Markdown Images (Log-To-Leak). https://embracethered.com/blog/posts/2023/ chatgpt-webpilot-data-exfil-via-markdown-injection/. Online; accessed 2024

work page 2024
[37]

David Schmotz, Luca Beurer-Kellner, Sahar Abdelnabi, and Maksym An- driushchenko. 2026. Skill-Inject: Measuring Agent Vulnerability to Skill File Attacks.arXiv preprint arXiv:2602.20156(2026)

work page arXiv 2026
[38]

SkillsMP. 2026. SkillsMP: The Agent Skills Marketplace. https://skillsmp.com/. [Online; accessed 26-March-2026]

work page 2026
[39]

Jonathan Sneh, Ruomei Yan, Jialin Yu, Philip Torr, Yarin Gal, Sunando Sengupta, Eric Sommerlade, Alasdair Paren, and Adel Bibi. 2025. Tooltweak: An attack on tool selection in llm-based agents.arXiv preprint arXiv:2510.02554(2025)

work page arXiv 2025
[40]

2018.MITRE ATT&CK®: Design and Philosophy

Blake I Strom, Andy Applebaum, Doug P Miller, Kathryn C Nickels, Adam G Pen- nington, and Corbin B Thomas. 2018.MITRE ATT&CK®: Design and Philosophy. Technical Report. The MITRE Corporation. https://attack.mitre.org/

work page 2018
[41]

Carol Taylor and Jim Alves-Foss. 2005. Diversity as a computer defense mecha- nism. InProceedings of the 2005 workshop on New security paradigms. 11–14

work page 2005
[42]

Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M Dai, Anja Hauth, Katie Millican, et al. 2023. Gemini: a family of highly capable multimodal models.arXiv preprint arXiv:2312.11805(2023)

work page internal anchor Pith review Pith/arXiv arXiv 2023
[43]

Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yas- mine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhos- ale, et al. 2023. Llama 2: Open foundation and fine-tuned chat models.arXiv Conference’17, July 2017, Washington, DC, USA Yubin Qu, Yi Liu, Tongcheng Geng, Gelei Deng, Yuekang Li, Leo Zhang, Ying Zhang,...

work page internal anchor Pith review Pith/arXiv arXiv 2023
[44]

Lei Wang, Chen Ma, Xueyang Feng, Zeyu Zhang, Hao Yang, Jingsen Zhang, Zhiyuan Chen, Jiakai Tang, Xu Chen, Yankai Lin, et al. 2024. A survey on large language model based autonomous agents.Frontiers of Computer Science18, 6 (2024), 186345

work page 2024
[45]

Xingyao Wang, Boxuan Li, Yufan Song, Frank F Xu, Xiangru Tang, Mingchen Zhuge, Jiayi Pan, Yueqi Song, Bowen Li, Jaskirat Singh, et al. 2024. Openhands: An open platform for ai software developers as generalist agents.arXiv preprint arXiv:2407.16741(2024)

work page internal anchor Pith review Pith/arXiv arXiv 2024
[46]

Yueming Wu, Deqing Zou, Shihan Dou, Wei Yang, Duo Xu, and Hai Jin. 2022. VulCNN: An image-inspired scalable vulnerability detection system. InProceed- ings of the 44th International Conference on Software Engineering. 2365–2376

work page 2022
[47]

Zhiheng Xi, Wenxiang Chen, Xin Guo, Wei He, Yiwen Ding, Boyang Hong, Ming Zhang, Junzhe Wang, Senjie Jin, Enyu Zhou, et al. 2025. The rise and potential of large language model based agents: A survey.Science China Information Sciences 68, 2 (2025), 121101

work page 2025
[48]

Hao Yang, Shuyuan Lin, Lin Cheng, Yang Lu, and Hanzi Wang. 2022. Scinet: Se- mantic cue infusion network for lane detection. In2022 IEEE International Con- ference on Image Processing (ICIP). IEEE, 1811–1815

work page 2022
[49]

2022.{DeepDi}: Learning a relational graph convolutional network model on instructions for fast and ac- curate disassembly

Sheng Yu, Yu Qu, Xunchao Hu, and Heng Yin. 2022.{DeepDi}: Learning a relational graph convolutional network model on instructions for fast and ac- curate disassembly. In31st USENIX Security Symposium (USENIX Security 22). 2709–2725

work page 2022
[50]

Tongxin Yuan, Zhiwei He, Lingzhong Dong, Yiming Wang, Ruijie Zhao, Tian Xia, Lizhen Xu, Binglin Zhou, Fangqi Li, Zhuosheng Zhang, et al. 2024. R-judge: Benchmarking safety risk awareness for llm agents. InFindings of the Association for Computational Linguistics: EMNLP 2024. 1467–1490

work page 2024
[51]

Qiusi Zhan, Zhixiang Liang, Zifan Ying, and Daniel Kang. 2024. Injecagent: Benchmarking indirect prompt injections in tool-integrated large language model agents. InFindings of the Association for Computational Linguistics: ACL

work page 2024
[52]

Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zi Lin, Zhuohan Li, Dacheng Li, Eric Xing, et al. 2023. Judging llm-as-a-judge with mt-bench and chatbot arena.Advances in neural information processing systems36 (2023), 46595–46623

work page 2023
[53]

Andy Zou, Zifan Wang, Nicholas Carlini, Milad Nasr, J Zico Kolter, and Matt Fredrikson. 2023. Universal and transferable adversarial attacks on aligned lan- guage models.arXiv preprint arXiv:2307.15043(2023)

work page internal anchor Pith review Pith/arXiv arXiv 2023
[54]

2025.{PoisonedRAG}: Knowledge corruption attacks to{Retrieval-Augmented}generation of large language models

Wei Zou, Runpeng Geng, Binghui Wang, and Jinyuan Jia. 2025.{PoisonedRAG}: Knowledge corruption attacks to{Retrieval-Augmented}generation of large language models. In34th USENIX Security Symposium (USENIX Security 25). 3827–3844

work page 2025