Shared-embedding sequence models cannot achieve Semantic-Faithful Control over control-authoritative actions due to provenance-recovery impossibility, control-path exposure, and a finite-coverage invariance gap.
3 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CR (3)
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
citing papers explorer
- On the Inseparability of Instructions and Data in Shared-Embedding Sequence Models
Shared-embedding sequence models cannot achieve Semantic-Faithful Control over control-authoritative actions due to provenance-recovery impossibility, control-path exposure, and a finite-coverage invariance gap.
- Security–Fidelity Tradeoffs: The Hidden Cost of Prompt Injection Defense
Prompt injection defenses create a security-fidelity tradeoff with no model or defense achieving both high security and high fidelity on the SecFid benchmark across 1,168 examples.
- Assessing Automated Prompt Injection Attacks in Agentic Environments
Black-box optimization outperforms gradient-based methods for prompt injection on LLM agents, with success depending on attacker model strength and limited transfer from small to frontier models.