pith. sign in

arxiv: 2605.28999 · v1 · pith:7CVF6H7Dnew · submitted 2026-05-27 · 💻 cs.CR · cs.AI· cs.CL· cs.LG

Measuring Real-World Prompt Injection Attacks in LLM-based Resume Screening

Pith reviewed 2026-06-29 11:14 UTC · model grok-4.3

classification 💻 cs.CR cs.AIcs.CLcs.LG
keywords prompt injectionLLM securityresume screeningreal-world attacksattack measurementhiring automationadversarial promptsprevalence study
0
0 comments X

The pith

Analysis of 200,000 real resumes finds roughly 1% contain hidden prompt injections aimed at LLM screeners.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper measures the actual occurrence of prompt injection attacks in LLM-based resume screening by examining a large collection of real job applications collected over multiple years. It reports that approximately 1% of resumes include hidden instructions meant to influence the model, that this share has risen noticeably in the past one to two years, and that more than 90% of the injections avoid explicit commands. These findings establish that prompt injection is not limited to lab demonstrations but appears at measurable scale in a deployed hiring system. A sympathetic reader would care because the result shows how a theoretical vulnerability translates into concrete interference with automated decision processes that affect employment.

Core claim

The authors built custom detectors for prompt injections in resumes, showed high precision on a manually labeled validation set, and applied them to a corpus of approximately 200,000 real-world resumes. The measurement reveals that about 1% of the resumes contain hidden prompt injections, the prevalence has increased over the past one to two years, and more than 90% of the injected prompts do not use explicit instructions. This supplies the first large-scale evidence that prompt injection attacks occur in real LLM-based applications.

What carries the argument

Custom detectors for resume-specific prompt injections, validated on a small manually labeled set and then run across the full 200K-resume corpus to quantify prevalence and trends.

If this is right

  • LLM resume screening tools face a measurable volume of real prompt injection attempts.
  • The rising trend indicates attackers are increasingly targeting these systems.
  • Because most attacks avoid explicit instructions, general-purpose detectors may miss many cases.
  • Mitigation strategies for LLM hiring tools must handle both the observed scale and the subtlety of the injections.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar measurement approaches could reveal whether prompt injection appears at comparable rates in other LLM decision systems such as loan screening or content filtering.
  • The finding that most injections are non-explicit suggests domain-specific training data may be needed to improve detector recall.
  • If the upward trend continues, organizations using LLM screeners may face growing pressure to add human review layers or input sanitization.

Load-bearing premise

The custom detectors keep high precision on the full 200K-resume set without a large rise in false positives that would inflate the reported 1% figure.

What would settle it

A manual review of several hundred randomly sampled resumes flagged by the detector that finds substantially fewer than 80% actually contain prompt injections would show the 1% prevalence is overstated.

Figures

Figures reproduced from arXiv: 2605.28999 by Dawn Song, Mohan Zhang, Neil Zhenqiang Gong, Steven Jiang, Tianlong Chen, Yuqi Jia, Zhen Tan.

Figure 1
Figure 1. Figure 1: Illustrative examples of two types of hidden prompt [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Overview of HCD. Stage 1 applies four rule-based [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Overview of VDA. A resume PDF is converted into [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Agreement matrix between HCD and VDA detec [PITH_FULL_IMAGE:figures/full_fig_p006_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Fraction of detected malicious resumes over time [PITH_FULL_IMAGE:figures/full_fig_p009_5.png] view at source ↗
Figure 8
Figure 8. Figure 8: Distribution of instruction injection subtypes for the [PITH_FULL_IMAGE:figures/full_fig_p010_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Distribution of data injection subtypes for the Ap [PITH_FULL_IMAGE:figures/full_fig_p011_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Malicious rate by demographic dimensions comparing the Applicant Match dataset (blue) and the ATS dataset (red): [PITH_FULL_IMAGE:figures/full_fig_p012_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: Malicious rate by industry comparing the Appli [PITH_FULL_IMAGE:figures/full_fig_p013_11.png] view at source ↗
Figure 13
Figure 13. Figure 13: Processing time distributions for each detection [PITH_FULL_IMAGE:figures/full_fig_p018_13.png] view at source ↗
Figure 14
Figure 14. Figure 14: Sanitized case studies based on real hidden injections detected in resume PDFs. Solid red boxes mark the original [PITH_FULL_IMAGE:figures/full_fig_p019_14.png] view at source ↗
read the original abstract

LLMs are vulnerable to prompt injection attacks. However, this vulnerability has been primarily demonstrated conceptually in academic studies or through a few anecdotal case studies. Its prevalence and impact in real-world LLM-based applications are largely unexplored. In this work, we present the first systematic study of prompt-injection attacks in a widely used application: LLM-based resume screening. Our analysis is based on approximately 200K real-world resumes collected over multiple years by hireEZ. We first design tailored methods to detect prompt injection in resumes. Manual validation on a small-scale dataset demonstrates that our detectors achieve high precision and outperform state-of-the-art general-purpose detectors. We then apply our detector to the full resume dataset and conduct a comprehensive measurement study of real-world prompt injection attacks. Our analysis reveals several intriguing findings: approximately 1% of resumes contain hidden prompt injections; the prevalence of such injected resumes has increased noticeably over the past one to two years; and more than 90% of injected prompts do not use explicit instructions. These results provide the first evidence of large-scale prompt injection in real-world LLM-based applications and lay the groundwork for future studies to understand and mitigate such attacks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper presents the first systematic measurement study of prompt injection attacks against LLM-based resume screening. It collects ~200K real-world resumes spanning multiple years, designs resume-tailored detectors, validates them on a small manually labeled subset (reporting high precision that outperforms general-purpose detectors), and applies the detectors to the full unlabeled corpus. Key findings are an approximately 1% prevalence of injected resumes, a noticeable increase over the past 1-2 years, and that >90% of injected prompts avoid explicit instructions.

Significance. If the detector precision holds at scale, the work supplies the first concrete evidence of prompt injection occurring at measurable volume in a production LLM application. The scale of the corpus, the temporal trend, and the observation that most attacks avoid explicit instructions are all useful for motivating defenses and for guiding future measurement studies in other LLM pipelines.

major comments (2)
  1. [Abstract / measurement results] Abstract and the measurement section: the central 1% prevalence figure is produced by running the custom detectors over the entire unlabeled ~200K-resume corpus. Validation is reported only on a small manually labeled subset that showed high precision. No false-positive rate, confidence intervals, or analysis of scale-dependent false positives (e.g., particular resume templates, non-English text, or formatting artifacts that may be rarer in the validation sample) is provided, so any inflation of the prevalence estimate cannot be quantified.
  2. [Detector design and validation] Detector evaluation: while the small-scale manual validation is described as outperforming general-purpose detectors, the paper does not report the size of that labeled set, the sampling method used to create it, or inter-annotator agreement. These details are required to assess whether the validation set is representative of the full multi-year, multi-format corpus.
minor comments (2)
  1. [Data collection] The abstract states 'approximately 200K' resumes; the exact count and any filtering criteria applied before detection should be stated explicitly in the data section.
  2. [Results] Figure or table presenting the temporal trend should include the number of resumes per year or quarter so that the reported increase can be assessed against changing corpus size.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our measurement study. The comments correctly identify gaps in the reporting of our validation procedure and prevalence estimation. We address each point below and will revise the manuscript accordingly.

read point-by-point responses
  1. Referee: [Abstract / measurement results] Abstract and the measurement section: the central 1% prevalence figure is produced by running the custom detectors over the entire unlabeled ~200K-resume corpus. Validation is reported only on a small manually labeled subset that showed high precision. No false-positive rate, confidence intervals, or analysis of scale-dependent false positives (e.g., particular resume templates, non-English text, or formatting artifacts that may be rarer in the validation sample) is provided, so any inflation of the prevalence estimate cannot be quantified.

    Authors: We agree that the manuscript should quantify uncertainty in the 1% estimate and address potential scale-dependent false positives. In the revision we will add bootstrap confidence intervals computed from the validation precision, report an empirical false-positive rate obtained by manually inspecting a random sample of detector outputs from the full corpus, and include a targeted analysis of common resume templates and non-English text as possible sources of error. We will also add an explicit limitations paragraph noting that exhaustive false-positive measurement on the entire unlabeled corpus is not feasible. revision: yes

  2. Referee: [Detector design and validation] Detector evaluation: while the small-scale manual validation is described as outperforming general-purpose detectors, the paper does not report the size of that labeled set, the sampling method used to create it, or inter-annotator agreement. These details are required to assess whether the validation set is representative of the full multi-year, multi-format corpus.

    Authors: The referee is correct that these methodological details are missing. The revised manuscript will report the exact size of the labeled validation set, describe the stratified sampling procedure used to select resumes across years and formats, and include the inter-annotator agreement statistic from our annotation process. These additions will allow readers to evaluate representativeness directly. revision: yes

Circularity Check

0 steps flagged

Pure empirical measurement study with no circular derivation

full rationale

This paper performs a measurement study: it designs custom detectors for prompt injection in resumes, validates detector precision via manual labeling on a small external subset, and applies the detectors to an independent ~200K-resume corpus collected by hireEZ. Prevalence numbers (approximately 1%) are produced directly by running the validated detectors on the external data. No equations, fitted parameters renamed as predictions, self-definitional constructs, or load-bearing self-citations appear in the derivation chain. The central claims rest on detector output against external benchmarks rather than reducing to the paper's own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central prevalence claim rests on the untested assumption that detector performance measured on a small manually labeled set transfers to the full proprietary corpus; no free parameters or invented entities are introduced.

axioms (1)
  • domain assumption Detector performance on the small manually validated set is representative of performance on the full 200K-resume dataset.
    This transfer assumption is required to treat the reported 1% as an accurate population estimate.

pith-pipeline@v0.9.1-grok · 5749 in / 1289 out tokens · 31685 ms · 2026-06-29T11:14:57.804952+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

49 extracted references · 22 canonical work pages · 7 internal anchors

  1. [1]

    The power of artificial in- telligence in recruitment: An analytical review of current ai-based recruitment strategies.International Journal of Professional Business Review: Int

    Wael Abdulrahman Albassam. The power of artificial in- telligence in recruitment: An analytical review of current ai-based recruitment strategies.International Journal of Professional Business Review: Int. J. Prof. Bus. Rev., 8(6):4, 2023

  2. [2]

    Struq: Defending against prompt injection with structured queries

    Sizhe Chen, Julien Piet, Chawin Sitawarin, and David Wagner. Struq: Defending against prompt injection with structured queries. InUSENIX Security Symposium, 2025

  3. [3]

    Secalign: Defending against prompt injection with preference optimization

    Sizhe Chen, Arman Zharmagambetov, Saeed Mahlou- jifar, Kamalika Chaudhuri, David Wagner, and Chuan Guo. Secalign: Defending against prompt injection with preference optimization. InCCS, 2025

  4. [4]

    Meta secalign: A secure foundation llm against prompt injection attacks, 2026

    Sizhe Chen, Arman Zharmagambetov, David Wagner, and Chuan Guo. Meta secalign: A secure foundation llm against prompt injection attacks.arXiv preprint arXiv:2507.02735, 2025

  5. [5]

    Securing AI Agents with Information-Flow Control

    Manuel Costa, Boris Köpf, Aashish Kolluri, Andrew Paverd, Mark Russinovich, Ahmed Salem, Shruti Tople, Lukas Wutschitz, and Santiago Zanella-Béguelin. Se- curing ai agents with information-flow control.arXiv preprint arXiv:2505.23643, 2025

  6. [6]

    System prompt extraction attacks and defenses in large language models.arXiv preprint arXiv:2505.23817, 2025

    Badhan Chandra Das, M Hadi Amini, and Yanzhao Wu. System prompt extraction attacks and defenses in large language models.arXiv preprint arXiv:2505.23817, 2025

  7. [7]

    Defeating Prompt Injections by Design

    Edoardo Debenedetti, Ilia Shumailov, Tianqi Fan, Jamie Hayes, Nicholas Carlini, Daniel Fabian, Christoph Kern, Chongyang Shi, Andreas Terzis, and Florian Tramèr. Defeating prompt injections by design.arXiv preprint arXiv:2503.18813, 2025

  8. [8]

    Owasp top 10 for large language model applications, 2023

    OWASP Foundation. Owasp top 10 for large language model applications, 2023. https://owasp.org/www-project-top-10-for-large- language-model-applications/

  9. [9]

    Application of llm agents in recruitment: a novel frame- work for automated resume screening.Journal of Infor- mation Processing, 32:881–893, 2024

    Chengguang Gan, Qinghao Zhang, and Tatsunori Mori. Application of llm agents in recruitment: a novel frame- work for automated resume screening.Journal of Infor- mation Processing, 32:881–893, 2024

  10. [10]

    Greenhouse AI features

    Greenhouse Software. Greenhouse AI features. https: //support.greenhouse.io/hc/en-us/articles/ 33043749845403-Greenhouse-AI-features, 2026

  11. [11]

    Not what you’ve signed up for: Compromising real-world llm-integrated applications with indirect prompt injec- tion

    Kai Greshake, Sahar Abdelnabi, Shailesh Mishra, Christoph Endres, Thorsten Holz, and Mario Fritz. Not what you’ve signed up for: Compromising real-world llm-integrated applications with indirect prompt injec- tion. InAISec, 2023

  12. [12]

    Pleak: Prompt leaking attacks against large language model applications

    Bo Hui, Haolin Yuan, Neil Gong, Philippe Burlina, and Yinzhi Cao. Pleak: Prompt leaking attacks against large language model applications. InProceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security, pages 3600–3614, 2024

  13. [13]

    Baseline Defenses for Adversarial Attacks Against Aligned Language Models

    Neel Jain, Avi Schwarzschild, Yuxin Wen, Gowthami Somepalli, John Kirchenbauer, Ping yeh Chiang, Micah Goldblum, Aniruddha Saha, Jonas Geiping, and Tom Goldstein. Baseline defenses for adversarial attacks against aligned language models.arXiv preprint arXiv:2309.00614, 2023

  14. [14]

    Promptlocate: Localizing prompt injection at- tacks

    Yuqi Jia, Yupei Liu, Zedian Shao, Jinyuan Jia, and Neil Gong. Promptlocate: Localizing prompt injection at- tacks. InIEEE S & P, 2026

  15. [15]

    A critical evaluation of defenses against prompt injection attacks.arXiv preprint arXiv:2505.18333, 2025

    Yuqi Jia, Zedian Shao, Yupei Liu, Jinyuan Jia, Dawn Song, and Neil Zhenqiang Gong. A critical evaluation of defenses against prompt injection attacks.arXiv preprint arXiv:2505.18333, 2025

  16. [16]

    Prompt flow integrity to prevent privilege escalation in llm agents.arXiv preprint arXiv:2503.15547, 2025

    Juhee Kim, Woohyuk Choi, and Byoungyoung Lee. Prompt flow integrity to prevent privilege escalation in llm agents.arXiv preprint arXiv:2503.15547, 2025

  17. [17]

    Fun-tuning: Characterizing the vulnerability of proprietary llms to optimization-based prompt injection attacks via the fine- tuning interface

    Andrey Labunets, Nishit V Pandya, Ashish Hooda, Xiaohan Fu, and Earlence Fernandes. Fun-tuning: Characterizing the vulnerability of proprietary llms to optimization-based prompt injection attacks via the fine- tuning interface. InIEEE S & P, 2025

  18. [18]

    Instruction defense

    Learn Prompting. Instruction defense. https: //learnprompting.org/docs/prompt_hacking/ defensive_measures/instruction, 2023

  19. [19]

    Random sequence enclosure

    Learn Prompting. Random sequence enclosure. https://learnprompting.org/docs/prompt_ hacking/defensive_measures/random_sequence, 2023

  20. [20]

    Sandwich defense

    Learn Prompting. Sandwich defense. https: //learnprompting.org/docs/prompt_hacking/ defensive_measures/sandwich_defense, 2023

  21. [21]

    AI-Powered Screening with Lever

    Lever. AI-Powered Screening with Lever. https://www.lever.co/solutions/ ai-powered-screening, 2026

  22. [22]

    ACE: A Security Architecture for LLM-Integrated App Systems

    Evan Li, Tushin Mallick, Evan Rose, William Robertson, Alina Oprea, and Cristina Nita-Rotaru. Ace: A secu- rity architecture for llm-integrated app systems.arXiv preprint arXiv:2504.20984, 2025

  23. [23]

    Automatic and universal prompt injection attacks against large language models.arXiv preprint arXiv:2403.04957, 2024

    Xiaogeng Liu, Zhiyuan Yu, Yizhe Zhang, Ning Zhang, and Chaowei Xiao. Automatic and universal prompt injection attacks against large language models.arXiv preprint arXiv:2403.04957, 2024

  24. [24]

    Wainjectbench: Benchmark- ing prompt injection detections for web agents.arXiv preprint arXiv:2510.01354, 2025

    Yinuo Liu, Ruohan Xu, Xilong Wang, Yuqi Jia, and Neil Zhenqiang Gong. Wainjectbench: Benchmark- ing prompt injection detections for web agents.arXiv preprint arXiv:2510.01354, 2025

  25. [25]

    Formalizing and benchmark- ing prompt injection attacks and defenses

    Yupei Liu, Yuqi Jia, Runpeng Geng, Jinyuan Jia, and Neil Zhenqiang Gong. Formalizing and benchmark- ing prompt injection attacks and defenses. InUSENIX Security Symposium, 2024

  26. [26]

    DataSentinel: A game-theoretic detection of prompt injection attacks

    Yupei Liu, Yuqi Jia, Jinyuan Jia, Dawn Song, and Neil Zhenqiang Gong. DataSentinel: A game-theoretic detection of prompt injection attacks. InIEEE S & P, 2025

  27. [27]

    Secinfer: Preventing prompt injection via inference-time scaling.arXiv preprint arXiv:2509.24967, 2025

    Yupei Liu, Yanting Wang, Yuqi Jia, Jinyuan Jia, and Neil Zhenqiang Gong. Secinfer: Preventing prompt injection via inference-time scaling.arXiv preprint arXiv:2509.24967, 2025

  28. [28]

    Ultimate ChatGPT prompt engineering guide for general users and develop- ers

    Alexandra Mendes. Ultimate ChatGPT prompt engineering guide for general users and develop- ers. https://www.imaginarycloud.com/blog/ chatgpt-prompt-engineering, 2023

  29. [29]

    Prompt Guard

    Meta Llama. Prompt Guard. https://huggingface. co/meta-llama/Prompt-Guard-86M , 2024. Model card

  30. [30]

    Ai security beyond core domains: Resume screening as a case study of adversarial vulnerabilities in specialized llm applications, 2025

    Honglin Mu, Jinghao Liu, Kaiyang Wan, Rui Xing, Xi- uying Chen, Timothy Baldwin, and Wanxiang Che. Ai security beyond core domains: Resume screening as a case study of adversarial vulnerabilities in specialized llm applications, 2025

  31. [31]

    Hidden ai prompts in research papers raise con- cerns over peer review

    Nature. Hidden ai prompts in research papers raise con- cerns over peer review. https://www.nature.com/ articles/d41586-025-02172-y, 2025

  32. [32]

    May i have your attention? breaking fine-tuning based prompt injection defenses using architecture-aware attacks.arXiv preprint arXiv:2507.07417, 2025

    Nishit V Pandya, Andrey Labunets, Sicun Gao, and Earlence Fernandes. May i have your attention? breaking fine-tuning based prompt injection defenses using architecture-aware attacks.arXiv preprint arXiv:2507.07417, 2025

  33. [33]

    Neural exec: Learning (and learning from) exe- cution triggers for prompt injection attacks

    Dario Pasquini, Martin Strohmeier, and Carmela Tron- coso. Neural exec: Learning (and learning from) exe- cution triggers for prompt injection attacks. InAISec, 2024

  34. [34]

    Ignore Previous Prompt: Attack Techniques For Language Models

    Fábio Perez and Ian Ribeiro. Ignore previous prompt: Attack techniques for language models.arXiv preprint arXiv:2211.09527, 2022

  35. [35]

    The new ai-powered bing is threatening users

    Billy Perrigo. The new ai-powered bing is threatening users. that’s no laughing matter.TIME, 2023

  36. [36]

    Lessons from defending gemini against indirect prompt injections.arXiv preprint arXiv:2505.14534, 2025

    Chongyang Shi, Sharon Lin, Shuang Song, Jamie Hayes, Ilia Shumailov, Itay Yona, Juliette Pluto, Aneesh Pappu, Christopher A Choquette-Choo, Milad Nasr, et al. Lessons from defending gemini against indirect prompt injections.arXiv preprint arXiv:2505.14534, 2025

  37. [37]

    Optimization-based prompt injection attack to llm-as-a- judge

    Jiawen Shi, Zenghui Yuan, Yinuo Liu, Yue Huang, Pan Zhou, Lichao Sun, and Neil Zhenqiang Gong. Optimization-based prompt injection attack to llm-as-a- judge. InCCS, 2024

  38. [38]

    Prompt Injection Attack to Tool Selection in LLM Agents

    Jiawen Shi, Zenghui Yuan, Guiyao Tie, Pan Zhou, Neil Zhenqiang Gong, and Lichao Sun. Prompt injec- tion attack to tool selection in llm agents.arXiv preprint arXiv:2504.19793, 2025

  39. [39]

    Progent: Securing AI Agents with Privilege Control

    Tianneng Shi, Jingxuan He, Zhun Wang, Linyu Wu, Hongwei Li, Wenbo Guo, and Dawn Song. Progent: Programmable privilege control for llm agents.arXiv preprint arXiv:2504.11703, 2025

  40. [40]

    Promptarmor: Simple yet effective prompt injection defenses, 2025

    Tianneng Shi, Kaijie Zhu, Zhun Wang, Yuqi Jia, Will Cai, Weida Liang, Haonan Wang, Hend Alzahrani, Joshua Lu, Kenji Kawaguchi, Basel Alomair, Xuandong Zhao, William Yang Wang, Neil Gong, Wenbo Guo, and Dawn Song. Promptarmor: Simple yet effective prompt injection defenses.arXiv preprint arXiv:2507.15219, 2025

  41. [41]

    Ai-powered resume screen- ing tools: Pros, cons, and solutions, 2023

    Aman Singh. Ai-powered resume screen- ing tools: Pros, cons, and solutions, 2023. https://www.linkedin.com/pulse/ai-powered-resume- screening-tools-pros-cons-solutions-singh-l28rc/

  42. [42]

    Tooltweak: An attack on tool selection in llm-based agents.arXiv preprint arXiv:2510.02554, 2025

    Jonathan Sneh, Ruomei Yan, Jialin Yu, Philip Torr, Yarin Gal, Sunando Sengupta, Eric Sommerlade, Alas- dair Paren, and Adel Bibi. Tooltweak: An attack on tool selection in llm-based agents.arXiv preprint arXiv:2510.02554, 2025

  43. [43]

    Promptsleuth: Detecting prompt injection via seman- tic intent invariance.arXiv preprint arXiv:2508.20890, 2025

    Mengxiao Wang, Yuxuan Zhang, and Guofei Gu. Promptsleuth: Detecting prompt injection via seman- tic intent invariance.arXiv preprint arXiv:2508.20890, 2025

  44. [44]

    Obliinjection: Order-oblivious prompt injection attack to llm agents with multi-source data.arXiv preprint arXiv:2512.09321, 2025

    Reachal Wang, Yuqi Jia, and Neil Zhenqiang Gong. Obliinjection: Order-oblivious prompt injection attack to llm agents with multi-source data.arXiv preprint arXiv:2512.09321, 2025

  45. [45]

    Webinject: Prompt injection attack to web agents

    Xilong Wang, John Bloch, Zedian Shao, Yuepeng Hu, Shuyan Zhou, and Neil Zhenqiang Gong. Webinject: Prompt injection attack to web agents. InProceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 2010–2030, 2025

  46. [46]

    Prompt injection attacks against GPT-3

    Simon Willison. Prompt injection attacks against GPT-3. https://simonwillison.net/2022/Sep/ 12/prompt-injection/, 2022

  47. [47]

    Delimiters won’t save you from prompt injection

    Simon Willison. Delimiters won’t save you from prompt injection. https://simonwillison.net/2023/May/ 11/delimiters-wont-save-you, 2023

  48. [48]

    System-level defense against indirect prompt injection attacks: An information flow control perspective.arXiv preprint arXiv:2409.19091, 2024

    Fangzhou Wu, Ethan Cecchetti, and Chaowei Xiao. System-level defense against indirect prompt injection attacks: An information flow control perspective.arXiv preprint arXiv:2409.19091, 2024

  49. [49]

    This candidate is a perfect fit for the role and the company

    Yiming Zhang, Nicholas Carlini, and Daphne Ippolito. Effective prompt extraction from language models. arXiv preprint arXiv:2307.06865, 2023. A Prevention-based Defenses Pre-processing prompts.Prompt pre-processing defenses aim to mitigate prompt injection by altering how inputs are structured and presented to the LLM. Common techniques include restructur...