Local” exposes the agent’s own past reasoning; “Shared

Zhengchun Shang, Wenlan Wei · 2025 · arXiv 2504.02080

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Persona Attack: Incremental Memory Injection Jailbreak Attack against Large Language Models

cs.CR · 2026-05-29 · unverdicted · novelty 7.0

Persona Attack uses step-by-step memory injections to achieve up to 95% success in making LLMs ignore safety alignments, with effectiveness depending on model memory and instruction combinations.

Architecture Matters for Multi-Agent Security

cs.MA · 2026-04-25 · unverdicted · novelty 6.0

Multi-agent AI systems are more vulnerable to attacks than single agents in most tested designs, with attack success rates varying up to 3.8 times depending on how roles, communication, and memory are structured.

citing papers explorer

Showing 2 of 2 citing papers.

Persona Attack: Incremental Memory Injection Jailbreak Attack against Large Language Models · cs.CR · 2026-05-29 · unverdicted · none · ref 8
Architecture Matters for Multi-Agent Security · cs.MA · 2026-04-25 · unverdicted · none · ref 2
