arxiv: 2604.27202 · v1 · submitted 2026-04-29 · 💻 cs.CR

Recognition: unknown

Indirect Prompt Injection in the Wild: An Empirical Study of Prevalence, Techniques, and Objectives

Bhupendra Acharya, Giancarlo Pellegrino, Soheil Khodayari, Xuenan Zhang

Authors on Pith no claims yet

Pith reviewed 2026-05-07 08:51 UTC · model grok-4.3

classification 💻 cs.CR

keywords indirect prompt injectionLLM securityweb vulnerabilitiesempirical analysisAI complianceprompt attacksweb crawlingmachine-readable instructions

0 comments

The pith

Webpages already contain thousands of hidden instructions aimed at manipulating AI systems that read them.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines how common indirect prompt injections are on the actual web by scanning over a billion URLs. It finds over fifteen thousand examples on nearly twelve thousand pages, mostly tucked away in code that humans do not see. These injections aim at machines like crawlers and agents rather than people, and tests show some AI models follow the hidden instructions at low but measurable rates. This matters because more systems now use AI models to read and act on web content, turning everyday sites into potential input channels. If the findings hold, current web practices already interfere with automated AI use in non-trivial ways.

Core claim

By crawling 1.2 billion URLs across 24.8 million hosts, the study locates 15,300 validated indirect prompt injections on 11,700 pages. Most of these target machine readers through non-rendered HTML sections such as headers and comments, with only a minority visible to humans. The injections pursue varied goals including disruption, reputation control, content protection, and bot detection. Controlled tests on 13 models with different input formats show that compliance reaches as high as 8 percent for smaller models when given plain text, though structured formats lower the rate by keeping layout cues intact.

What carries the argument

Large-scale URL crawling combined with pattern detection and follow-up controlled experiments on model compliance.

Load-bearing premise

The detected patterns are actually meant to change how AI models behave rather than being ordinary code or mistaken matches.

What would settle it

A re-analysis of the same set of URLs that finds the 15,300 instances are mostly false positives or a new round of experiments that shows zero compliance on all tested models and formats.

Figures

Figures reproduced from arXiv: 2604.27202 by Bhupendra Acharya, Giancarlo Pellegrino, Soheil Khodayari, Xuenan Zhang.

**Figure 1.** Figure 1: Distribution of prompt template frequencies. Tem view at source ↗

**Figure 2.** Figure 2: Role of page representation. are near zero on all non-text representations. This indicates that representation and model capacity interact: simpler representations reopen the attack surface most strongly for weaker models, whereas stronger models benefit more consistently from structural cues present in HTML, raw responses, and snapshots view at source ↗

**Figure 3.** Figure 3: Distribution of prompt injections per page and host. view at source ↗

**Figure 4.** Figure 4: Spatial distribution of visible prompt injections view at source ↗

**Figure 5.** Figure 5: Top technique combinations for hiding prompt view at source ↗

**Figure 6.** Figure 6: Prompt injection rate across domain topic categories and Tranco rank buckets. view at source ↗

**Figure 8.** Figure 8: Cumulative distribution function (CDF) of Tranco view at source ↗

read the original abstract

As LLMs are increasingly integrated into systems that browse, retrieve, summarize, and act on web content, webpages have become an untrusted input vector for downstream model behavior. This enables site owners, contributors, and adversaries to embed instructions directly in web resources, i.e., indirect prompt injections. While prior work demonstrates such attacks in controlled settings, their prevalence, deployment, and real-world impact remain unclear. We present one of the first large-scale empirical analyses of indirect prompt injections in webpages and HTTP responses. Analyzing 1.2B URLs from 24.8M hosts, we identify 15.3K validated instances across 11.7K pages. These are not isolated cases: a small number of recurring templates account for most cases. We characterize their objectives, delivery mechanisms, visibility, persistence, and impact, revealing a heterogeneous ecosystem spanning disruptive prompts, reputation manipulation, content-protection directives, and AI-bot detection, targeting systems such as crawlers, search pipelines, customer-support agents, and hiring workflows. A key finding is that most instructions target machines rather than humans: about 70% appear in non-rendered HTML (e.g., headers, comments, metadata), and many visible cases are hidden via rendering techniques. To assess practical risk, we run 5,200 controlled experiments across 13 models and four webpage representations. Our results show compliance is limited but non-negligible, reaching up to 8% for smaller models on plain-text inputs, while structured representations reduce compliance by preserving structural cues. Overall, prompt-based interference is already present in the web ecosystem and represents a growing source of tension between LLM-driven automation and the sites it consumes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper offers the first large-scale view of indirect prompt injections in the wild, but its detection method needs more scrutiny to support the prevalence claims.

read the letter

This paper's main takeaway is a first-of-its-kind crawl that turns up 15.3K indirect prompt injection instances across 1.2B URLs. It also runs compliance tests on 13 models to show that real risk exists but stays limited in practice. That combination of scale and controlled follow-up is what makes the work worth reading. They map out objectives like crawler disruption and content protection, note that 70% sit in non-rendered HTML, and find that structured page representations cut compliance rates. The empirical framing moves the discussion past pure lab attacks, and the recurring-template observation gives a concrete sense of how these things actually appear on the web. The detection and validation steps are the soft spot. The abstract and stress-test note both flag that the 15.3K count rests on pattern matching without enough public detail on false-positive controls or how benign meta tags and comments were filtered. If many of those instances are noise, the prevalence numbers and the downstream experiments lose force. Crawling biases and selection criteria are also thin in the summary, which leaves the central claims only partly anchored. This is for people who build or study LLM systems that ingest web content, especially security and automation teams. Readers who want quantified evidence of an existing threat will get something usable from the categories and compliance rates. It deserves a serious referee because the topic is timely and the data volume is new, even though the methods section will need tightening before publication. I would send it to review with a clear request to document the validation pipeline and any manual checks they performed.

Referee Report

2 major / 2 minor

Summary. The paper presents a large-scale empirical study of indirect prompt injections in webpages and HTTP responses. Scanning 1.2B URLs from 24.8M hosts yields 15.3K validated instances across 11.7K pages; the authors characterize objectives, delivery mechanisms, visibility (noting ~70% in non-rendered HTML), persistence, and impact, then evaluate practical risk via 5,200 controlled experiments across 13 models and four webpage representations, reporting limited but non-negligible compliance (up to 8% for smaller models on plain-text inputs).

Significance. If the detections and experiments hold, the work supplies one of the first quantitative characterizations of indirect prompt injection prevalence and risk in the open web, a timely contribution given growing LLM integration with web content. The scale of the crawl and the multi-model compliance tests are clear strengths that could guide both defensive research and system design for crawlers, agents, and retrieval pipelines.

major comments (2)

Abstract and prevalence results: the central claim of 15.3K validated instances is load-bearing, yet the validation criteria used to confirm these are genuine LLM-targeted instructions (rather than benign non-rendered HTML such as meta tags, comments, or SEO directives) are not specified. Because ~70% of reported instances fall in non-rendered HTML and a small number of templates dominate, explicit controls for context and false-positive mitigation are required to substantiate the ecosystem characterization and downstream compliance experiments.
Experimental evaluation section: the 5,200 experiments report compliance rates up to 8%, but the precise definitions of the four webpage representations, the prompt templates, and the binary compliance metric are not detailed enough to assess whether the results generalize to deployed LLM systems or reflect realistic input conditions.

minor comments (2)

The abstract and introduction could include a brief statement of limitations on crawling coverage and validation precision.
Tables or figures summarizing the distribution of objectives and templates would benefit from explicit sample sizes and confidence intervals.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their positive evaluation of the work's significance and for the constructive feedback. We address each major comment below, clarifying our approach and indicating the revisions made to improve transparency and reproducibility.

read point-by-point responses

Referee: Abstract and prevalence results: the central claim of 15.3K validated instances is load-bearing, yet the validation criteria used to confirm these are genuine LLM-targeted instructions (rather than benign non-rendered HTML such as meta tags, comments, or SEO directives) are not specified. Because ~70% of reported instances fall in non-rendered HTML and a small number of templates dominate, explicit controls for context and false-positive mitigation are required to substantiate the ecosystem characterization and downstream compliance experiments.

Authors: We agree that the validation criteria require more explicit description to support the prevalence claims, particularly given the prevalence of non-rendered content. The original manuscript outlined a multi-stage detection and validation process in Section 3, but we acknowledge the criteria were not detailed sufficiently. In the revised version, we have added a dedicated subsection specifying: (1) automated detection rules targeting imperative, AI-directed language (e.g., overrides of prior instructions or role assignments) while excluding standard benign patterns such as meta tags, HTML comments, and SEO directives via a curated exclusion list; (2) context-aware checks to confirm the instruction is not legitimate HTML boilerplate; and (3) manual validation on a random sample of 500 instances by two independent annotators (Cohen's kappa = 0.89), with an estimated false-positive rate of under 5%. These additions directly address concerns about false positives in non-rendered HTML and strengthen the foundation for the compliance experiments. revision: yes
Referee: Experimental evaluation section: the 5,200 experiments report compliance rates up to 8%, but the precise definitions of the four webpage representations, the prompt templates, and the binary compliance metric are not detailed enough to assess whether the results generalize to deployed LLM systems or reflect realistic input conditions.

Authors: We concur that greater specificity on the experimental setup is needed for reproducibility and to evaluate applicability to real deployments. The original text described the four representations and compliance tests at a summary level in Section 5. In the revision, we have expanded this section to define: (1) the four webpage representations explicitly (plain-text extraction, full HTML source, rendered DOM text, and structured JSON with preserved hierarchy); (2) the complete prompt templates, including system prompts, user queries, and exact insertion points for webpage content across the 13 models; and (3) the binary compliance metric, operationalized as whether the model output executes the injected directive (e.g., performs the specified action or includes the requested content), with examples of compliant vs. non-compliant cases and inter-annotator validation on 200 samples. We also added a limitations paragraph discussing how these controlled conditions relate to (and may differ from) production LLM systems. These changes enable readers to better assess generalizability. revision: yes

Circularity Check

0 steps flagged

No circularity: direct empirical measurement with independent validation and experiments

full rationale

The paper is a large-scale observational study that scans 1.2B URLs, validates 15.3K instances via manual or rule-based checks, characterizes patterns, and runs separate 5,200-model compliance experiments. No equations, fitted parameters, derivations, or predictions appear in the provided text. Claims rest on raw counts, template recurrence, and controlled test outcomes rather than any self-definition, renaming, or self-citation chain that reduces results to inputs by construction. The derivation chain is therefore self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The central claims rest on empirical data collection and validation rather than mathematical axioms or new entities. Implicit assumptions include the representativeness of the crawled URLs and the accuracy of pattern-based detection for prompt injections.

pith-pipeline@v0.9.0 · 5618 in / 1220 out tokens · 56202 ms · 2026-05-07T08:51:24.021407+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

102 extracted references · 7 canonical work pages

[1]

https://search.censys.io

[n.d.].Censys Search. https://search.censys.io
[2]

[n.d.].Chrome DevTools Protocol.https://chromedevtools.github.io/devtools- protocol/
[3]

https://privacysandbox.google.com/private- advertising/topics

[n.d.].Chrome Topics API. https://privacysandbox.google.com/private- advertising/topics
[4]

[n.d.].Common Crawl web crawl data.https://commoncrawl.org/
[5]

https://developer.mozilla.org/en-US/docs/Web/CS S/Reference/Properties/clip-path

[n.d.].CSS clip-path property. https://developer.mozilla.org/en-US/docs/Web/CS S/Reference/Properties/clip-path
[6]

https://developer.mozilla.org/en-US/docs/Web/CSS/ display

[n.d.].CSS display property. https://developer.mozilla.org/en-US/docs/Web/CSS/ display
[7]

https://developer.mozilla.org/en-US/docs/Web/CSS/ opacity

[n.d.].CSS opacity property. https://developer.mozilla.org/en-US/docs/Web/CSS/ opacity
[8]

https://developer.mozilla.org/en-US/docs/Web/ CSS/Reference/Properties/text-indent

[n.d.].CSS text-indent property. https://developer.mozilla.org/en-US/docs/Web/ CSS/Reference/Properties/text-indent
[9]

https://developer.mozilla.org/en-US/docs/Web/CS S/visibility

[n.d.].CSS visibility property. https://developer.mozilla.org/en-US/docs/Web/CS S/visibility
[10]

https://developer.mozilla.org/en-US/docs/Web/CSS/ z-index

[n.d.].CSS z-index property. https://developer.mozilla.org/en-US/docs/Web/CSS/ z-index
[11]

https://scikit-learn.org/stable/modules/generated/s klearn.cluster.DBSCAN.html

[n.d.].DBSCAN Clustering. https://scikit-learn.org/stable/modules/generated/s klearn.cluster.DBSCAN.html
[12]

[n.d.].Document: elementFromPoint() method.https://developer.mozilla.org/en- US/docs/Web/API/Document/elementFromPoint
[13]

[n.d.].DOM stacking context.https://developer.mozilla.org/en-US/docs/Web/CS S/Guides/Positioned_layout/Stacking_context
[14]

[n.d.].Headless Chromium.https://chromium.googlesource.com/chromium/src /+/lkgr/headless/README.md
[15]

https://www.w3.org/wiki/Hit_Testing

[n.d.].Hit Testing. https://www.w3.org/wiki/Hit_Testing
[16]

[n.d.].HTML hidden global attribute.https://developer.mozilla.org/en- US/docs/Web/HTML/Reference/Global_attributes/hidden
[17]

[n.d.].Llama Prompt Guard 2.https://www.llama.com/docs/model-cards-and- prompt-formats/prompt-guard/
[18]

[n.d.].MDN color contrast guide.https://developer.mozilla.org/en-US/docs/Web/ Accessibility/Guides/Understanding_WCAG/Perceivable/Color_contrast
[19]

[n.d.].Playwright browser automation framework.https://playwright.dev/
[20]

[n.d.].Playwright SnapshotForAI() Method.https://github.com/microsoft/playw right-python/issues/2867
[21]

https://huggingface.co/sen tence-transformers/all-MiniLM-L6-v2

[n.d.].Sentence transformer all-MiniLM-L6-v2 model. https://huggingface.co/sen tence-transformers/all-MiniLM-L6-v2
[22]

https://www.shodan.io/

[n.d.].Shodan. https://www.shodan.io/
[23]

https://developers.google.com/sear ch/docs/essentials/spam-policies#hidden-text-and-links

[n.d.].Spam policies for Google web search. https://developers.google.com/sear ch/docs/essentials/spam-policies#hidden-text-and-links
[24]

https://gomakethings.com/the- many-ways-to-hide-things-in-the-dom/

[n.d.].The many ways to hide things in the DOM. https://gomakethings.com/the- many-ways-to-hide-things-in-the-dom/
[25]

[n.d.].Viewport concepts.https://developer.mozilla.org/en-US/docs/Web/CSS/Vi ewport_concepts
[26]

https://www.w3.org/TR/CSS2/visur en.html

[n.d.].W3C DOM visual formatting model. https://www.w3.org/TR/CSS2/visur en.html
[27]

[n.d.].WCAG definition of relative luminance.https://www.w3.org/WAI/GL/wi ki/Relative_luminance
[28]

mozilla.org/en-US/docs/Web/Accessibility/Guides/Colors_and_Luminance

[n.d.].Web accessibility: understanding colors and luminance.https://developer. mozilla.org/en-US/docs/Web/Accessibility/Guides/Colors_and_Luminance
[29]

[n.d.].Web Content Accessibility Guidelines (WCAG21).https://www.w3.org/TR/ WCAG21/
[30]

https://brave.com/blog/unseeable-prompt-injections/

2024.Microsoft Copilot: From Prompt Injection to Exfiltration of Personal Informa- tion. https://brave.com/blog/unseeable-prompt-injections/

2024
[31]

https://pypi.org/project/beautifulsoup4/

2025.BeautifulSoup4 Library. https://pypi.org/project/beautifulsoup4/

2025
[32]

Daniel Ayzenshteyn, Roy Weiss, and Yisroel Mirsky. 2025. Cloak, Honey, Trap: Proactive Defenses Against LLM Agents. InUSENIX Security Symposium

2025
[33]

Bogdan Calin. 2025. Prompt Injection Attacks on Applications That Use LLMs. (2025). https://www.invicti.com/white-papers/prompt-injection-attacks-on- llm-applications-ebook

2025
[34]

Artem Chaikin and Shivan Kaul Sahib. 2025. Agentic Browser Security: Indirect Prompt Injection in Perplexity Comet. (2025). https://brave.com/blog/comet- prompt-injection/

2025
[35]

Artem Chaikin and Shivan Kaul Sahib. 2025. Unseeable prompt injections in screenshots: more vulnerabilities in Comet and other AI browsers. (2025). https: //brave.com/blog/unseeable-prompt-injections/

2025
[36]

Patrick Chao, Alexander Robey, Edgar Dobriban, Hamed Hassani, George J Pappas, and Eric Wong. 2025. Jailbreaking black box large language models in twenty queries. InIEEE Conference on Secure and Trustworthy Machine Learning (SaTML)

2025
[37]

Sizhe Chen, Julien Piet, Chawin Sitawarin, and David Wagner. 2025. StruQ: Defending Against Prompt Injection with Structured Queries. InUSENIX Security Symposium

2025
[38]

Sizhe Chen, Arman Zharmagambetov, Saeed Mahloujifar, Kamalika Chaudhuri, David Wagner, and Chuan Guo. 2025. SecAlign: Defending Against Prompt Injection with Preference Optimization. InACM Special Interest Group on Security, Audit and Control (SIGSAC)

2025
[39]

By Curtis Collicutt. 2024. Ignore All Previous Instructions and Do This Instead! Defending Against Prompt Injection. (2024). https://taico.ca/posts/defending- against-prompt-injection/

2024
[40]

Jian Cui, Mingming Zha, XiaoFeng Wang, and Xiaojing Liao. 2025. The Odyssey of robots.txt Governance: Measuring Convention Implications of Web Bots in Large Language Model Services. InProc. of the ACM Conference on Computer and Communications Security (CCS)

2025
[41]

Yu Cui, Sicheng Pan, Yifei Liu, Haibin Zhang, and Cong Zuo. 2026. Vortexpia: Indirect prompt injection attack against llms for efficient extraction of user privacy. InFindings of the Association for Computational Linguistics: EACL 2026. 587–609

2026
[42]

Rein Daelman. 2025. PromptPwnd: Prompt Injection Vulnerabilities in GitHub Actions Using AI Agents. (2025). https://www.aikido.dev/blog/promptpwnd- github-actions-ai-agents

2025
[43]

Gianluca De Stefano, Lea Schoenherr, and Giancarlo Pellegrino. 2024. Rag and roll: An end-to-end evaluation of indirect prompt manipulations in llm-based application frameworks.arXiv preprint 2408.05025(2024)

work page arXiv 2024
[44]

Edoardo Debenedetti, Jie Zhang, Mislav Balunovic, Luca Beurer-Kellner, Marc Fischer, and Florian Tramèr. 2024. AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents. InNeurIPS Datasets and Benchmarks

2024
[45]

Edoardo Debenedetti, Jie Zhang, Mislav Balunovic, Luca Beurer-Kellner, Marc Fischer, and Florian Tramèr. 2024. Agentdojo: A dynamic environment to eval- uate prompt injection attacks and defenses for llm agents.Advances in Neural Information Processing Systems (NeurIPS)(2024)

2024
[46]

Berkay Celik

Devin Ersoy, Brandon Lee, Ananth Shreekumar, Arjun Arunasalam, Muhammad Ibrahim, Antonio Bianchi, and Z. Berkay Celik. 2026. Investigating the Impact of Dark Patterns on LLM-Based Web Agents. InProc. of the IEEE Symposium on Security and Privacy (S&P)

2026
[47]

Ivan Evtimov, Arman Zharmagambetov, Aaron Grattafiori, Chuan Guo, and Kamalika Chaudhuri. 2025. WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks. arXiv:2504.18575

work page arXiv 2025
[48]

Ariel Fogel and Dan Lisichkin. 2025. Anatomy of an Indirect Prompt Injection. (2025). https://www.pillar.security/blog/anatomy-of-an-indirect-prompt- injection

2025
[49]

Dan Goodin. 2025. Hackers exploit a blind spot by hiding malware inside DNS records. (2025). https://arstechnica.com/security/2025/07/hackers-exploit-a- blind-spot-by-hiding-malware-inside-dns-records/

2025
[50]

Kai Greshake, Sahar Abdelnabi, Shailesh Mishra, Christoph Endres, Thorsten Holz, and Mario Fritz. 2023. Not What You’ve Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection. InProc. of the ACM Workshop on Artificial Intelligence and Security (AISEC)

2023
[51]

Shuai Guan. 2025. What Does a Scraper Do? Exploring Functions and Benefits. (2025). https://thunderbit.com/blog/what-does-a-scraper-do

2025
[52]

Dennis Jacob, Hend Alzahrani, Zhanhao Hu, Basel Alomair, and David Wagner
[53]

PromptShield: Deployable Detection for Prompt Injection Attacks. InProc. of the ACM Conference on Data and Application Security and Privacy (CODASPY)
[54]

Myriam Jessier. 2025. Hidden prompt injection: The black hat trick AI outgrew. (2025). https://searchengineland.com/hidden-prompt-injection-black-hat-trick- ai-outgrew-462331

2025
[55]

Zhifeng Jiang, Zhihua Jin, and Guoliang He. 2024. PromptKeeper: Safeguarding System Prompts for LLMs.arXiv preprint 2412.13426(2024)

work page arXiv 2024
[56]

Yigitcan Kaya, Anton Landerer, Stijn Pletinckx, Michelle Zimmermann, Christo- pher Kruegel, and Giovanni Vigna. 2026. When AI Meets the Web: Prompt Injection Risks in Third-Party AI Chatbot Plugins. InProc. of the IEEE Symposium on Security and Privacy (S&P)

2026
[57]

Hanna Kim, Minkyoo Song, Seung Ho Na, Seungwon Shin, and Kimin Lee. 2025. When {LLMs} go online: The emerging threat of {Web-Enabled} {LLMs}. In USENIX Security Symposium (USENIX Security)

2025
[58]

Koster, G

M. Koster, G. Illyes, H. Zeller, and L. Sassman. 2022.Robots Exclusion Protocol. RFC 9309. IETF. https://www.rfc-editor.org/rfc/rfc9309.txt

2022
[59]

Andrey Labunets, Nishit V Pandya, Ashish Hooda, Xiaohan Fu, and Earlence Fernandes. 2025. Fun-tuning: Characterizing the vulnerability of proprietary llms to optimization-based prompt injection attacks via the fine-tuning interface. InIEEE Symposium on Security and Privacy (IEEE S&P)

2025
[60]

Victor Le Pochat, Tom Van Goethem, Samaneh Tajalizadehkhoob, Maciej Ko- rczyński, and Wouter Joosen. 2019. Tranco: A Research-Oriented Top Sites Ranking Hardened Against Manipulation. InNetwork and Distributed System Security Symposium (NDSS)

2019
[61]

Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdel rahman Mohamed, Omer Levy, Veselin Stoyanov, and Luke Zettlemoyer. 2019. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. InProc. of the Annual Meeting of the Association for Computational Linguistics (ACL). 13

2019
[62]

Evan Li, Tushin Mallick, Evan Rose, William Robertson, Alina Oprea, and Cristina Nita-Rotaru. 2026. ACE: A Security Architecture for LLM-Integrated App Sys- tems. InNetwork and Distributed System Security Symposium (NDSS)

2026
[63]

Xinfeng Li, Tianze Qiu, Yingbin Jin, Lixu Wang, Hanqing Guo, Xiaojun Jia, Xiaofeng Wang, and Wei Dong. 2026. WebCloak: Characterizing and Mitigating Threats from LLM-Driven Web Agents as Intelligent Scrapers. InProc. of the IEEE Symposium on Security and Privacy (S&P)

2026
[64]

Fengyu Liu, Yuan Zhang, Jiaqi Luo, Jiarun Dai, Tian Chen, Letian Yuan, Zhengmin Yu, Youkun Shi, Ke Li, Chengyuan Zhou, et al. 2025. Make agent defeat agent: Automatic detection of {Taint-Style} vulnerabilities in {LLM-based} agents. In USENIX Security Symposium (USENIX Security)

2025
[65]

Ruixuan Liu, Toan Tran, Tianhao Wang, Hongsheng Hu, Shuo Wang, and Li Xiong. 2026. ExpShield: Safeguarding Web Text from Unauthorized Crawling and LLM Exploitation. InNetwork and Distributed System Security Symposium (NDSS)

2026
[66]

Yupei Liu, Yuqi Jia, Runpeng Geng, Jinyuan Jia, and Neil Zhenqiang Gong. 2024. Formalizing and benchmarking prompt injection attacks and defenses. InUSENIX Security Symposium

2024
[67]

Yupei Liu, Yuqi Jia, Runpeng Geng, Jinyuan Jia, and Neil Zhenqiang Gong
[68]

In USENIX Security Symposium

Formalizing and Benchmarking Prompt Injection Attacks and Defenses. In USENIX Security Symposium
[69]

Yupei Liu, Yuqi Jia, Jinyuan Jia, Dawn Song, and Neil Zhenqiang Gong. 2025. Datasentinel: A game-theoretic detection of prompt injection attacks. In2025 IEEE Symposium on Security and Privacy (SP). IEEE, 2190–2208

2025
[70]

Duc Cuong Nguyen, Erik Derr, Michael Backes, and Sven Bugiel. 2019. Short Text, Large Effect: Measuring the Impact of User Reviews on Android App Security & Privacy. InProc. of the IEEE Symposium on Security and Privacy (S&P)

2019
[71]

Rodrigo Nogueira, Zhiying Jiang, Ronak Pradeep, and Jimmy J. Lin. 2020. Docu- ment Ranking with a Pretrained Sequence-to-Sequence Model.arXiv preprint 2003.06713v1(2020)

work page arXiv 2020
[72]

Alexandros Ntoulas, Marc Najork, Mark Manasse, and Dennis Fetterly. 2006. Detecting spam web pages through content analysis. InThe Web Conference

2006
[73]

OpenAI. 2025. Introducing ChatGPT agent: bridging research and action. (2025). https://openai.com/index/introducing-chatgpt-agent/

2025
[74]

2026.ChatGPT Atlas

OpenAI. 2026.ChatGPT Atlas. https://chatgpt.com/atlas/

2026
[75]

OWASP. [n.d.]. LLM Prompt Injection Prevention Cheat Sheet. https://cheatshe etseries.owasp.org/cheatsheets/LLM _Prompt_Injection_Prevention_Cheat_S heet.html
[76]

Pavan Reddy and Aditya Sanjay Gujral. 2025. EchoLeak: The First Real-World Zero-Click Prompt Injection Exploit in a Production LLM System. InAssociation for the Advancement of Artificial Intelligence Symposium (AAAI)

2025
[77]

Franziska Roesner, Tadayoshi Kohno, Ning Zhang, and Umar Iqbal. 2025. Iso- lateGPT: An Execution Isolation Architecture for LLM-Based Agentic Systems. InNetwork and Distributed System Security Symposium (NDSS)

2025
[78]

Sander Schulhoff. 2025. Types of Prompt Injection. (2025). https://learnprompti ng.org/docs/prompt_hacking/injection

2025
[79]

Avital Shafran, Roei Schuster, and Vitaly Shmatikov. 2025. Machine against the RAG: jamming retrieval-augmented generation with blocker documents. In USENIX Security Symposium

2025
[80]

Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, and Yang Zhang. 2024. Do anything now: Characterizing and evaluating in-the-wild jailbreak prompts on large language models. InProc. of the ACM Conference on Computer and Communications Security (CCS)

2024

Showing first 80 references.