and Ippolito, Daphne and Tram

Carlini, Nicholas, Nasr, Milad, Debenedetti, Edoardo, Wang, Barry, Choquette-Choo, Christopher A · 2025 · arXiv 2505.11449

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Security and Privacy Prompts in the Wild: What Users Ask LLMs and How LLMs Respond

cs.CL · 2026-06-16 · unverdicted · novelty 7.0

Analysis of 14,727 security and privacy prompts from WildChat finds commercial LLMs give higher-quality responses than open-weight models but can produce inconsistent answers across repeated queries.

AI Snitches Get Glitches: Towards Evading Agentic Surveillance

cs.AI · 2026-06-24 · unverdicted · novelty 6.0

Formalizes agentic surveillance, releases SurveilBench for testing AI reporting behaviors across corporate, education, and police scenarios, and develops three prompt-injection evasion techniques.

AI Agents Enable Adaptive Computer Worms

cs.CR · 2026-06-02 · unverdicted · novelty 6.0

AI agents enable adaptive computer worms that propagate autonomously by reasoning about targets and synthesizing attacks using LLMs on stolen compute.

citing papers explorer

Showing 1 of 1 citing paper after filters.

AI Snitches Get Glitches: Towards Evading Agentic Surveillance

cs.AI · 2026-06-24 · unverdicted · none · ref 21

Formalizes agentic surveillance, releases SurveilBench for testing AI reporting behaviors across corporate, education, and police scenarios, and develops three prompt-injection evasion techniques.
