Benchmarking and defending against indirect prompt injection attacks on large language models

Jingwei Yi, Yueqi Xie, Bin Zhu, Emre Kiciman, Guangzhong Sun, Xing Xie, Fangzhao Wu · 2025

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

A Sentence Relation-Based Approach to Sanitizing Malicious Instructions

cs.CR · 2026-05-01 · unverdicted · novelty 6.0

SONAR constructs a relational graph from entailment and contradiction scores to prune injected malicious sentences from LLM prompts while preserving context, achieving near-zero attack success rates.

MCP Pitfall Lab: Exposing Developer Pitfalls in MCP Tool Server Security under Multi-Vector Attacks

cs.CR · 2026-04-23 · unverdicted · novelty 6.0

MCP Pitfall Lab operationalizes six pitfall classes across tool-metadata poisoning, puppet servers, and multimodal chains, showing that recommended hardening removes all Tier-1 static findings and that agent narratives mismatch traces in 63% of tested runs.

citing papers explorer

A Sentence Relation-Based Approach to Sanitizing Malicious Instructions · cs.CR · 2026-05-01 · unverdicted · none · ref 39
MCP Pitfall Lab: Exposing Developer Pitfalls in MCP Tool Server Security under Multi-Vector Attacks · cs.CR · 2026-04-23 · unverdicted · none · ref 19
