pith. sign in

arxiv: 2508.20424 · v4 · pith:UGLNRESOnew · submitted 2025-08-28 · 💻 cs.CR

Attacks on Approximate Caches in Text-to-Image Diffusion Models

classification 💻 cs.CR
keywords approximatecachepromptscachingmodelssecurityattackattacker
0
0 comments X
read the original abstract

Diffusion models are a powerful class of generative models that produce images and other content from user prompts, but they are computationally intensive. To mitigate this cost, recent academic and industry work has adopted approximate caching, which reuses intermediate states from similar prompts in a cache. While efficient, this optimization introduces new security risks by breaking isolation among users. This paper provides a comprehensive assessment of the security vulnerabilities introduced by approximate caching. First, we demonstrate a remote covert channel established with the approximate cache, where a sender injects prompts with special keywords into the cache system and a receiver can recover that even after days, to exchange information. Second, we introduce a prompt stealing attack using the approximate cache, where an attacker can recover existing cached prompts from hits. Finally, we introduce a poisoning attack that embeds the attacker's logos into the previously stolen prompt, leading to unexpected logo rendering for the requests that hit the poisoned cache prompts. These attacks are all performed remotely through the serving system, demonstrating severe security vulnerabilities in approximate caching. The code for this work is available.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Generate "Normal", Edit Poisoned: Branding Injection via Hint Embedding in Image Editing

    cs.CR 2026-05 unverdicted novelty 7.0

    Invisible hints such as logos embedded in images are re-rendered by diffusion models during text-guided editing, enabling phishing and model-poisoning attacks with average success rates of 44.4% and 32.2%.