arxiv: 2509.10577 · v3 · submitted 2025-09-11 · 💻 cs.CR · cs.AI

The Coding Limits of Robust Watermarking for Generative Models

Pith reviewed 2026-05-18 17:11 UTC · model grok-4.3

classification 💻 cs.CR cs.AI
keywords watermarking · tamper detection · robustness · generative models · symbol corruption · coding limits · soundness · image watermarking

The pith

No sound watermarking scheme can reliably detect tampering once an adversary independently corrupts more than a 1-1/q fraction of the symbols, where q is the alphabet size.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a zero-bit tamper-detection code to capture the core requirements of robust watermarking for generative models. It proves that for any alphabet of size q, once independent symbol corruption exceeds the rate 1-1/q, no scheme satisfying soundness can reliably identify tampering. This limit holds even when allowing a constant false-positive probability on unmarked content. The threshold is shown to be tight through matching constructions that work for all lower rates. Experiments on existing image watermarking confirm the bound appears in practice when a crop-and-resize attack flips roughly half the latent signs and blocks decoding.
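One way to see why $1-1/q$ is a natural threshold is the following short calculation. It assumes a specific corruption model, the $q$-ary symmetric channel, in which a corrupted symbol is replaced by a uniformly random different symbol; the symbols $X$, $Y$, and $p$ are introduced here for illustration, and the paper's own proof may proceed differently.

```latex
% X: transmitted symbol, Y: received symbol, p: per-symbol corruption probability.
\Pr[Y = y \mid X = x] =
\begin{cases}
  1 - p           & \text{if } y = x, \\
  \dfrac{p}{q-1}  & \text{if } y \neq x.
\end{cases}

% At the critical rate p = 1 - 1/q both cases coincide:
1 - p = \frac{1}{q},
\qquad
\frac{p}{q-1} = \frac{(q-1)/q}{q-1} = \frac{1}{q}.
```

Under this model, at $p = 1-1/q$ every received symbol is uniform over the alphabet and independent of the transmitted one, so the corrupted codeword is distributed exactly like random content.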

Core claim

In the zero-bit tamper-detection code abstraction, for an alphabet of size q there is a critical corruption rate of 1-1/q such that no scheme with soundness can reliably detect tampering once an adversary changes more than this fraction of symbols under independent corruption. This limit is unconditional and holds even when relaxing soundness to allow fixed constant false positive probability on random content. The threshold is tight, as information-theoretic constructions achieve soundness and tamper detection for all strictly smaller rates. In the binary case this means no cryptographic watermark remains robust if more than half the encoded bits are modified. Experiments on image watermark

What carries the argument

The zero-bit tamper-detection code, a secret-key procedure that samples a pseudorandom codeword and decides whether a candidate is unmarked content or the result of tampering.

Load-bearing premise

The adversary corrupts each symbol independently with some probability instead of using correlated or adaptive modifications across positions.
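A minimal sketch of such an independent-corruption channel, for intuition only. The helper names `corrupt` and `agreement` are hypothetical, and the choice to replace a corrupted symbol with a uniformly random different symbol is an assumption, not something the page specifies.

```python
import random

def corrupt(word, q, p, rng):
    """Independently corrupt each symbol of `word` (values in range(q)) with probability p.

    Assumption: a corrupted symbol becomes a uniformly random *different* symbol.
    """
    out = []
    for s in word:
        if rng.random() < p:
            # Draw uniformly from the q-1 symbols other than s.
            r = rng.randrange(q - 1)
            out.append(r if r < s else r + 1)
        else:
            out.append(s)
    return out

def agreement(a, b):
    """Fraction of positions where the two words carry the same symbol."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

rng = random.Random(0)
q, n = 4, 20000
codeword = [rng.randrange(q) for _ in range(n)]
for p in (0.25, 0.5, 1 - 1 / q):
    # The expected agreement is 1 - p; at p = 1 - 1/q it equals the 1/q
    # agreement that an unrelated random word would have.
    print(p, round(agreement(codeword, corrupt(codeword, q, p, rng)), 3))
```

Because each position is handled separately, this sketch says nothing about correlated or adaptive modifications, which the premise above explicitly excludes.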

What would settle it

An explicit construction of a sound scheme that still detects tampering after independent corruption strictly above the 1-1/q rate, or a proof that every scheme below the rate fails on some independent corruption pattern.

Figures

Figures reproduced from arXiv: 2509.10577 by Daniele Venturi, Danilo Francati, Giuseppe Ateniese, Shubham Pawar, Yevin Nikhel Goonatilake.

Figure 1: A series of images showing various attacks on a watermarked image, none...
Figure 2: Comparison between original and crop & resize attacked images.
Figure 3: Latent sign analysis of a watermarked image before and after the crop and resize...
Figure 4: Visualization of this branch of attacks (b-e) compared to the original (a)...
Figure 5: Side-by-side comparison of latent space visualizations for the main...
Figure 6: Watermark removal on bordered image.
Original abstract

We study a basic question about cryptographic watermarking for generative models: how reliable can a watermark remain when an adversary is allowed to corrupt the encoded signal? To address this question, we introduce a minimal coding abstraction that we call a zero-bit tamper-detection code. This is a secret-key procedure that samples a pseudorandom codeword and, given a candidate word, decides whether it should be treated as unmarked content or as the result of tampering with a valid codeword. It captures the two core requirements of robust watermarking: soundness and tamper detection. Within this abstraction we prove a sharp unconditional limit on robustness to independent symbol corruption. For an alphabet of size $q$, there is a critical corruption rate of $1-1/q$ such that no scheme with soundness, even relaxed to allow a fixed constant false positive probability on random content, can reliably detect tampering once an adversary can change more than this fraction of symbols. In particular, in the binary case no cryptographic watermark can remain robust if more than half of the encoded bits are modified. We also show that this threshold is tight by giving simple information-theoretic constructions that achieve soundness and tamper detection for all strictly smaller corruption rates. We then test experimentally whether this limit appears in practice by looking at the recent watermarking for images of Gunn, Zhao, and Song (ICLR 2025). We show that a simple crop and resize operation reliably flipped about half of the latent signs and consistently prevented belief-propagation decoding from recovering the codeword, erasing the watermark while leaving the image visually intact.
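To make the two requirements concrete, here is a toy agreement-counting detector. It is a sketch under stated assumptions and not the paper's construction: the secret key is used directly as a truly random codeword, the function names `keygen`, `corrupt`, and `detect` are hypothetical, and the decision threshold is placed halfway between the $1/q$ agreement expected from random content and the $1-p$ agreement expected after corruption at rate $p$.

```python
import random

def keygen(n, q, rng):
    """Secret key: a uniformly random word over range(q), used directly as the codeword."""
    return [rng.randrange(q) for _ in range(n)]

def corrupt(word, q, p, rng):
    """Assumed channel: each symbol independently becomes a random different symbol w.p. p."""
    out = []
    for s in word:
        if rng.random() < p:
            r = rng.randrange(q - 1)
            out.append(r if r < s else r + 1)
        else:
            out.append(s)
    return out

def detect(key, word, q, p_max):
    """Return True for "tampered codeword", False for "unmarked content"."""
    # Gap between expected agreement after corruption (1 - p_max) and chance level (1/q).
    gap = (1 - p_max) - 1 / q
    if gap <= 1e-12:
        # At or above the critical rate there is no gap to threshold on.
        raise ValueError("p_max must be strictly below 1 - 1/q")
    agree = sum(k == w for k, w in zip(key, word)) / len(key)
    return agree > 1 / q + gap / 2

rng = random.Random(0)
n, q, p = 4000, 2, 0.3
key = keygen(n, q, rng)
print("tampered codeword flagged:", detect(key, corrupt(key, q, p, rng), q, p))
print("random content flagged:   ", detect(key, keygen(n, q, rng), q, p))
```

By a standard Hoeffding bound, both error probabilities of this toy detector decay exponentially in the codeword length whenever the gap is positive; at or beyond $1-1/q$ the gap vanishes and the threshold test has nothing to separate.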

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The paper introduces zero-bit tamper-detection codes as a minimal abstraction capturing soundness and tamper detection for cryptographic watermarking of generative models. It proves an unconditional information-theoretic impossibility: for alphabet size q, no sound scheme (even allowing constant false-positive probability on random content) can reliably detect tampering under independent symbol corruption once the rate exceeds 1-1/q. Matching explicit constructions achieve the properties for all strictly lower rates. The binary case yields a 1/2 threshold. An experimental section applies crop-and-resize to the Gunn et al. (ICLR 2025) image watermarking scheme and shows it flips roughly half the latent signs, preventing belief-propagation decoding while leaving the image intact.

Significance. If the result holds, the work establishes a sharp, fundamental coding limit on robust watermarking against independent corruptions. Strengths include the unconditional proof, parameter-free constructions, and direct experimental test of the predicted threshold on a published scheme. The relaxation to constant false-positive rate and the clear statement of the independence assumption make the bound broadly applicable to future designs.

minor comments (2)
  1. The experimental description of the crop-resize attack would be strengthened by reporting the precise crop ratio, resize dimensions, and the exact fraction of sign flips observed across multiple images (rather than the approximate 'about half').
  2. In the construction section, short pseudocode or explicit parameter settings for the information-theoretic encoder/decoder would improve clarity for readers implementing the schemes below the threshold.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive and accurate summary of our work, as well as for recommending acceptance. We are pleased that the unconditional information-theoretic bound, the tightness via explicit constructions, and the experimental test on the Gunn et al. scheme were viewed as strengths.

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained

full rationale

The paper establishes an information-theoretic impossibility result for zero-bit tamper-detection codes under independent symbol corruption, proving a sharp threshold of 1-1/q for alphabet size q using standard coding arguments. Matching constructions for rates below the threshold are given as explicit parameter-free schemes. No steps reduce by construction to fitted parameters, self-definitions, or load-bearing self-citations; the experimental illustration on an external scheme (Gunn et al.) is separate and does not affect the theoretical claim. The derivation relies on first-principles analysis of soundness and tamper detection without circular reductions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The central claim rests on the newly introduced abstraction plus standard cryptographic assumptions of pseudorandom sampling under a secret key; no free parameters are fitted to data.

axioms (1)
  • domain assumption: pseudorandom codeword sampling with a secret key
    Invoked to ensure the codeword looks random to an adversary without the key.
invented entities (1)
  • zero-bit tamper-detection code (no independent evidence)
    purpose: Minimal secret-key procedure that samples a pseudorandom codeword and decides whether a candidate is unmarked or tampered
    Newly defined abstraction that captures soundness and tamper detection for the watermarking setting.

pith-pipeline@v0.9.0 · 5826 in / 1252 out tokens · 37696 ms · 2026-05-18T17:11:56.005772+00:00 · methodology

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Removing the Watermark Is Not Enough: Forensic Stealth in Generative-AI Watermark Removal

    cs.CR 2026-05 unverdicted novelty 6.0

    Current AI image watermark removal attacks replace the watermark with a different forensic signal, allowing independent detectors to distinguish processed outputs from clean images at over 98% true-positive rate under...

  2. Fundamental Trade-Offs in Multi-Bit Watermarking of Stochastic Processes

    cs.IT 2026-05 unverdicted novelty 5.0

    Derives matched converse and achievability bounds that characterize optimal trade-offs among false-alarm probability, detection error probability, distortion, and information rate for multi-bit watermarking of station...

Reference graph

Works this paper leans on

37 extracted references · 37 canonical work pages · cited by 2 Pith papers

  2. [2]

    Ideal pseudorandom codes

    Omar Alrabiah, Prabhanjan Ananth, Miranda Christ, Yevgeniy Dodis, and Sam Gunn. Ideal pseudorandom codes. In Proceedings of the 57th Annual ACM Symposium on Theory of Computing, STOC '25, pages 1638–1647, New York, NY, USA, 2025. Association for Computing Machinery

  3. [3]

    My AI safety lecture for UT effective altruism

    Scott Aaronson. My AI safety lecture for UT effective altruism. https://scottaaronson.blog/?p=6823, 2022. Discusses watermarking projects at OpenAI. Accessed: September 2025

  4. [4]

    Waves: benchmarking the robustness of image watermarks

    Bang An, Mucong Ding, Tahseen Rabbani, Aakriti Agrawal, Yuancheng Xu, Chenghao Deng, Sicheng Zhu, Abdirisak Mohamed, Yuxin Wen, Tom Goldstein, and Furong Huang. Waves: benchmarking the robustness of image watermarks. In Proceedings of the 41st International Conference on Machine Learning, ICML'24. JMLR.org, 2024

  5. [5]

    Natural language watermarking: Design, analysis, and a proof-of-concept implementation

    Mikhail J. Atallah, Victor Raskin, Michael Crogan, Christian Hempelmann, Florian Kerschbaum, Dina Mohamed, and Sanket Naik. Natural language watermarking: Design, analysis, and a proof-of-concept implementation. In Ira S. Moskowitz, editor, Information Hiding, pages 185–200, Berlin, Heidelberg, 2001. Springer Berlin Heidelberg

  6. [6]

    Natural language watermarking and tamperproofing

    Mikhail J. Atallah, Victor Raskin, Christian F. Hempelmann, Mercan Karahan, Radu Sion, Umut Topkara, and Katrina E. Triezenberg. Natural language watermarking and tamperproofing. In Fabien A. P. Petitcolas, editor, Information Hiding, pages 196–212, Berlin, Heidelberg, 2003. Springer Berlin Heidelberg

  7. [7]

    Audio watermarking: Features, applications and algorithms

    Michael Arnold. Audio watermarking: Features, applications and algorithms. In 2000 IEEE International conference on multimedia and expo. ICME2000. Proceedings. Latest advances in the fast changing world of multimedia (cat. no. 00TH8532), volume 2, pages 1013–1016. IEEE, 2000

  8. [8]

    Watermarking digital images for copyright protection

    F.M. Boland, J.J.K. O'Ruanaidh, and C. Dautzenberg. Watermarking digital images for copyright protection. In Fifth International Conference on Image Processing and its Applications, 1995, pages 326–330, 1995

  9. [9]

    Digital watermarks for audio signals

    Laurence Boney, Ahmed H Tewfik, and Khaled N Hamdy. Digital watermarks for audio signals. In Proceedings of the third IEEE international conference on multimedia computing and systems, pages 473–480. IEEE, 1996

  10. [10]

    Pseudorandom error-correcting codes

    Miranda Christ and Sam Gunn. Pseudorandom error-correcting codes. In Leonid Reyzin and Douglas Stebila, editors, CRYPTO 2024, Part VI, volume 14925 of LNCS, pages 325–347. Springer, Cham, August 2024

  11. [11]

    Revealing weaknesses in text watermarking through self-information rewrite attacks

    Yixin Cheng, Hongcheng Guo, Yangming Li, and Leonid Sigal. Revealing weaknesses in text watermarking through self-information rewrite attacks. In Forty-second International Conference on Machine Learning, 2025

  12. [12]

    Undetectable watermarks for language models

    Miranda Christ, Sam Gunn, and Or Zamir. Undetectable watermarks for language models. In Shipra Agrawal and Aaron Roth, editors, Proceedings of Thirty Seventh Conference on Learning Theory, volume 247 of Proceedings of Machine Learning Research, pages 1125–1139. PMLR, 30 Jun–03 Jul 2024

  13. [13]

    Digital Watermarking and Steganography

    Ingemar Cox, Matthew Miller, Jeffrey Bloom, Jessica Fridrich, and Ton Kalker. Digital Watermarking and Steganography. Morgan Kaufmann Publishers Inc., 2007

  14. [14]

    Regulation (EU) 2024/1689: Artificial Intelligence Act

    European Parliament and Council of the European Union. Regulation (EU) 2024/1689: Artificial Intelligence Act. https://eur-lex.europa.eu/eli/reg/2024/1689/oj/eng, 2024. Official Journal of the European Union, L 1689, 12 July 2024. Accessed: September 2025

  15. [15]

    Executive Order No. 14110: Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence

    Executive Office of the President of the United States. Executive Order No. 14110: Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. https://www.federalregister.gov/documents/2023/11/01/2023-24283, 2023. Federal Register, Vol. 88, No. 212, pp. 75191–75212. Accessed: September 2025

  16. [16]

    The stable signature: Rooting watermarks in latent diffusion models

    Pierre Fernandez, Guillaume Couairon, Hervé Jégou, Matthijs Douze, and Teddy Furon. The stable signature: Rooting watermarks in latent diffusion models. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 22466–22477, 2023

  17. [17]

    Publicly-detectable watermarking for language models

    Jaiden Fairoze, Sanjam Garg, Somesh Jha, Saeed Mahloujifar, Mohammad Mahmoody, and Mingyuan Wang. Publicly-detectable watermarking for language models. IACR Communications in Cryptology, 1(4), 2025

  18. [18]

    Supervised gan watermarking for intellectual property protection

    Jianwei Fei, Zhihua Xia, Benedetta Tondi, and Mauro Barni. Supervised gan watermarking for intellectual property protection. In 2022 IEEE International Workshop on Information Forensics and Security (WIFS), pages 1–6, 2022

  19. [19]

    New constructions of pseudorandom codes

    Surendra Ghentiyala and Venkatesan Guruswami. New constructions of pseudorandom codes. arXiv preprint arXiv:2409.07580, 2024

  20. [20]

    An undetectable watermark for generative image models

    Sam Gunn, Xuandong Zhao, and Dawn Song. An undetectable watermark for generative image models. In The Thirteenth International Conference on Learning Representations, 2025

  21. [21]

    Generating steganographic images via adversarial training

    Jamie Hayes and George Danezis. Generating steganographic images via adversarial training. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS'17, pages 1951–1960, Red Hook, NY, USA, 2017. Curran Associates Inc

  22. [22]

    A watermark for large language models

    John Kirchenbauer, Jonas Geiping, Yuxin Wen, Jonathan Katz, Ian Miers, and Tom Goldstein. A watermark for large language models. In Andreas Krause, Emma Brunskill, Kyunghyun Cho, Barbara Engelhardt, Sivan Sabato, and Jonathan Scarlett, editors, Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Lear...

  23. [23]

    Unmarker: A universal attack on defensive image watermarking

    Andre Kassis and Urs Hengartner. Unmarker: A universal attack on defensive image watermarking. arXiv:2405.08363, 2024. To appear at IEEE S&P 2025

  24. [24]

    Paraphrasing evades detectors of ai-generated text, but retrieval is an effective defense

    Kalpesh Krishna, Yixiao Song, Marzena Karpinska, John Wieting, and Mohit Iyyer. Paraphrasing evades detectors of ai-generated text, but retrieval is an effective defense. Advances in Neural Information Processing Systems, 36:27469–27500, 2023

  25. [25]

    Robust distortion-free watermarks for language models

    Rohith Kuditipudi, John Thickstun, Tatsunori Hashimoto, and Percy Liang. Robust distortion-free watermarks for language models. Transactions on Machine Learning Research, 2024

  26. [26]

    Leveraging optimization for adaptive attacks on image watermarks

    Nils Lukas, Abdulrahman Diaa, Lucas Fenaux, and Florian Kerschbaum. Leveraging optimization for adaptive attacks on image watermarks. In The Twelfth International Conference on Learning Representations, 2024

  27. [27]

    Rotation, scale and translation invariant digital image watermarking

    Joseph Ó Ruanaidh and Thierry Pun. Rotation, scale and translation invariant digital image watermarking. Proceedings of International Conference on Image Processing, 1:536–539 vol.1, 1997

  28. [28]

    Robustness of AI-image detectors: Fundamental limits and practical attacks

    Mehrdad Saberi, Vinu Sankar Sadasivan, Keivan Rezaei, Aounon Kumar, Atoosa Chegini, Wenxiao Wang, and Soheil Feizi. Robustness of AI-image detectors: Fundamental limits and practical attacks. In The Twelfth International Conference on Learning Representations, 2024

  29. [29]

    Tree-ring watermarks: Fingerprints for diffusion images that are invisible and robust

    Yuxin Wen, John Kirchenbauer, Jonas Geiping, and Tom Goldstein. Tree-ring watermarks: Fingerprints for diffusion images that are invisible and robust. arXiv preprint arXiv:2305.20030, 2023

  30. [30]

    A comprehensive survey on robust image watermarking

    Wenbo Wan, Jun Wang, Yunming Zhang, Jing Li, Hui Yu, and Jiande Sun. A comprehensive survey on robust image watermarking. Neurocomputing, 488:226–247, 2022

  31. [31]

    Artificial fingerprinting for generative models: Rooting deepfake attribution in training data

    Ning Yu, Vladislav Skripniuk, Sahar Abdelnabi, and Mario Fritz. Artificial fingerprinting for generative models: Rooting deepfake attribution in training data. In 2021 IEEE/CVF International Conference on Computer Vision (ICCV), pages 14428–14437, 2021

  32. [32]

    Watermarks in the sand: Impossibility of strong watermarking for generative models

    Hanlin Zhang, Benjamin L Edelman, Danilo Francati, Daniele Venturi, Giuseppe Ateniese, and Boaz Barak. Watermarks in the sand: Impossibility of strong watermarking for generative models. arXiv preprint arXiv:2311.04378, 2023

  33. [33]

    Watermarks in the sand: impossibility of strong watermarking for language models

    Hanlin Zhang, Benjamin L Edelman, Danilo Francati, Daniele Venturi, Giuseppe Ateniese, and Boaz Barak. Watermarks in the sand: impossibility of strong watermarking for language models. In Forty-first International Conference on Machine Learning, 2024

  34. [34]

    Hidden: Hiding data with deep networks

    Jiren Zhu, Russell Kaplan, Justin Johnson, and Li Fei-Fei. Hidden: Hiding data with deep networks. In Computer Vision – ECCV 2018: 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part XV, pages 682–697, Berlin, Heidelberg, 2018. Springer-Verlag

  35. [35]

    A recipe for watermarking diffusion models

    Yunqing Zhao, Tianyu Pang, Chao Du, Xiao Yang, Ngai-Man Cheung, and Min Lin. A recipe for watermarking diffusion models. arXiv preprint arXiv:2303.10137, 2023

  36. [36]

    Invisible image watermarks are provably removable using generative ai

    Xuandong Zhao, Kexun Zhang, Zihao Su, Saastha Vasan, Ilya Grishchenko, Christopher Kruegel, Giovanni Vigna, Yu-Xiang Wang, and Lei Li. Invisible image watermarks are provably removable using generative ai. In A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, and C. Zhang, editors, Advances in Neural Information Processing Systems, volu...

  37. [37]

    Securing deep generative models with universal adversarial signature

    Yu Zeng, Mo Zhou, Yuan Xue, and Vishal M Patel. Securing deep generative models with universal adversarial signature. arXiv preprint arXiv:2305.16310, 2023