Title resolution pending
4 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (4)
citing papers explorer
- Trojan Hippo: Weaponizing Agent Memory for Data Exfiltration
  Trojan Hippo attacks on LLM agent memory achieve 85-100% success rates in data exfiltration across four memory backends even after 100 benign sessions, while evaluated defenses reduce success rates but impose varying utility costs.
- Comparing Human Oversight Strategies for Computer-Use Agents
  Oversight strategy in computer-use agents shapes exposure to problematic actions more reliably than correction success, with plan-based approaches reducing occurrences but not uniformly improving interventions.
- Evaluating Privilege Usage of Agents with Real-World Tools
  GrantBox evaluates LLM agents using real-world tools and finds they remain vulnerable to sophisticated prompt injection attacks with an 84.80% average success rate.
- "Like Taking the Path of Least Resistance": Exploring the Impact of LLM Interaction on the Creative Process of Programming
  LLM assistance shortens idea-generation periods and reduces creative moments during programming tasks while yielding solutions with comparable idea counts and greater functional correctness.