PIIGuard uses optimized hidden HTML fragments on webpages to block LLMs from leaking contact PII via indirect prompt injection, achieving at least 97% defense success across tested models while preserving benign QA utility.
arXiv preprint arXiv:2409.11295 , year=
3 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.CR 3years
2026 3representative citing papers
ARGUS defends LLM agents from context-aware prompt injections by tracking information provenance and verifying decisions against trustworthy evidence, reducing attack success to 3.8% while retaining 87.5% task utility.
ADAM extracts data from LLM agent memory with up to 100% attack success rate by estimating data distribution and selecting queries via entropy guidance.
citing papers explorer
-
PIIGuard: Mitigating PII Harvesting under Adversarial Sanitization
PIIGuard uses optimized hidden HTML fragments on webpages to block LLMs from leaking contact PII via indirect prompt injection, achieving at least 97% defense success across tested models while preserving benign QA utility.
-
ARGUS: Defending LLM Agents Against Context-Aware Prompt Injection
ARGUS defends LLM agents from context-aware prompt injections by tracking information provenance and verifying decisions against trustworthy evidence, reducing attack success to 3.8% while retaining 87.5% task utility.
-
ADAM: A Systematic Data Extraction Attack on Agent Memory via Adaptive Querying
ADAM extracts data from LLM agent memory with up to 100% attack success rate by estimating data distribution and selecting queries via entropy guidance.