Benchmarking and defending against indirect prompt injection attacks on large language models
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CR (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
representative citing papers
MCP Pitfall Lab operationalizes six pitfall classes across tool-metadata poisoning, puppet servers, and multimodal chains, showing that recommended hardening removes all Tier-1 static findings and that agent narratives mismatch traces in 63% of tested runs.
citing papers explorer
- A Sentence Relation-Based Approach to Sanitizing Malicious Instructions
SONAR constructs a relational graph from entailment and contradiction scores to prune injected malicious sentences from LLM prompts while preserving context, achieving near-zero attack success rates.
- MCP Pitfall Lab: Exposing Developer Pitfalls in MCP Tool Server Security under Multi-Vector Attacks
MCP Pitfall Lab operationalizes six pitfall classes across tool-metadata poisoning, puppet servers, and multimodal chains, showing that recommended hardening removes all Tier-1 static findings and that agent narratives mismatch traces in 63% of tested runs.