QuantileMark: A Message-Symmetric Multi-bit Watermark for LLMs
Pith reviewed 2026-05-10 14:06 UTC · model grok-4.3
The pith
QuantileMark embeds multi-bit messages by partitioning the cumulative probability interval into equal-mass bins and sampling only from the bin indexed by the message.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
QuantileMark embeds the message inside the quantile of the cumulative distribution function by partitioning [0,1) into M equal-mass bins and restricting the next-token sample to the bin indexed by the message. This construction guarantees message-unbiasedness: the expected token distribution, conditioned on a uniform random message, equals the unwatermarked model distribution. The equal-mass design further ensures that every message receives identical evidence strength during verification, in which the verifier reconstructs the same partitions under teacher forcing and aggregates posterior bin probabilities.
What carries the argument
The equal-mass partition of the cumulative probability interval [0,1) into M bins, from which the model is forced to sample the token belonging to the message-assigned bin.
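The sampling step this describes can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `quantile_sample`, `probs`, and `num_bins` are hypothetical names, and a real system would operate on model logits rather than a probability list.

```python
import random

def quantile_sample(probs, message, num_bins, rng=random):
    """Sample a token whose quantile lands inside the message's equal-mass bin.

    `probs` is the model's next-token distribution; `message` indexes one of
    `num_bins` equal-mass bins of the unit interval. Illustrative sketch only.
    """
    lo = message / num_bins            # bin is [lo, hi) with mass exactly 1/M
    hi = (message + 1) / num_bins
    u = lo + (hi - lo) * rng.random()  # uniform draw inside the assigned bin
    # Invert the CDF: return the token whose cumulative interval contains u.
    cum = 0.0
    for token, p in enumerate(probs):
        cum += p
        if u < cum:
            return token
    return len(probs) - 1              # guard against floating-point round-off
```

Because `u` is uniform within a bin of mass 1/M, averaging over a uniform message makes `u` uniform on [0,1), and inverse-CDF sampling then recovers the base distribution, which is the message-unbiasedness property the paper proves.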
Load-bearing premise
The verifier can exactly reconstruct the same cumulative-probability partitions that were used at generation time, even when tokens were sampled rather than chosen by greedy decoding.
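Under this premise, per-token verification reduces to recomputing the token's cumulative interval and intersecting it with the bins. A minimal sketch under the paper's stated construction (function and variable names are illustrative, not the paper's):

```python
def bin_posterior(probs, token, num_bins):
    """Posterior over latent message bins given one observed token.

    Under teacher forcing the verifier recomputes the same next-token
    distribution, so the token's cumulative interval [F(t-1), F(t)) matches
    the one used at generation time. The likelihood of the token under
    message m is the overlap of that interval with bin m, scaled by M
    (within-bin sampling is uniform). Illustrative sketch only.
    """
    f_lo = sum(probs[:token])              # F(t-1)
    f_hi = f_lo + probs[token]             # F(t)
    post = []
    for m in range(num_bins):
        b_lo, b_hi = m / num_bins, (m + 1) / num_bins
        overlap = max(0.0, min(f_hi, b_hi) - max(f_lo, b_lo))
        post.append(overlap * num_bins)    # likelihood P(token | message m)
    total = sum(post)
    return [p / total for p in post]       # normalize under a uniform prior
```

For example, a token covering [0, 0.5) with M=4 overlaps bins 0 and 1 equally, so the posterior is split between those two messages; aggregating such posteriors across many tokens concentrates the evidence on the embedded message.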
What would settle it
Measuring whether the average next-token distribution over a uniform sample of messages deviates from the base model's distribution on held-out prompts.
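In exact arithmetic this check should succeed identically, since message-unbiasedness is an algebraic property of the construction. The following sketch (hypothetical names, assuming the equal-mass construction as stated) computes the total-variation distance between the message-averaged distribution and the base distribution for a single step:

```python
def avg_watermarked(probs, num_bins):
    """Average the message-conditioned distributions over a uniform message.

    P(t | m) = M * overlap(token t's CDF interval, bin m); averaging over a
    uniform message m should recover `probs` exactly. Illustrative check only.
    """
    avg = [0.0] * len(probs)
    f_lo = 0.0
    for t, p in enumerate(probs):
        f_hi = f_lo + p
        for m in range(num_bins):
            b_lo, b_hi = m / num_bins, (m + 1) / num_bins
            overlap = max(0.0, min(f_hi, b_hi) - max(f_lo, b_lo))
            avg[t] += (1.0 / num_bins) * num_bins * overlap
        f_lo = f_hi
    return avg

base = [0.62, 0.2, 0.1, 0.08]
tv = 0.5 * sum(abs(a - b) for a, b in zip(avg_watermarked(base, 8), base))
```

The empirical version of this test on held-out prompts, as proposed above, would additionally surface any deviation introduced by finite-precision arithmetic or implementation details.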
Original abstract
As large language models become standard backends for content generation, practical provenance increasingly requires multi-bit watermarking. In provider-internal deployments, a key requirement is message symmetry: the message itself should not systematically affect either text quality or verification outcomes. Vocabulary-partition watermarks can break message symmetry in low-entropy decoding: some messages are assigned most of the probability mass, while others are forced to use tail tokens. This makes embedding quality and message decoding accuracy message-dependent. We propose QuantileMark, a white-box multi-bit watermark that embeds messages within the continuous cumulative probability interval $[0, 1)$. At each step, QuantileMark partitions this interval into $M$ equal-mass bins and samples strictly from the bin assigned to the target symbol, ensuring a fixed $1/M$ probability budget regardless of context entropy. For detection, the verifier reconstructs the same partition under teacher forcing, computes posteriors over latent bins, and aggregates evidence for verification. We prove message-unbiasedness, a property ensuring that the base distribution is recovered when averaging over messages. This provides a theoretical foundation for generation-side symmetry, while the equal-mass design additionally promotes uniform evidence strength across messages on the detection side. Empirical results on C4 continuation and LFQA show improved multi-bit recovery and detection robustness over strong baselines, with negligible impact on generation quality. Our code is available at GitHub (https://github.com/zzzjunlin/QuantileMark).
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes QuantileMark, a white-box multi-bit watermark for LLMs that partitions the continuous cumulative probability interval [0,1) into M equal-mass bins at each step and samples from the bin corresponding to the target message. It proves message-unbiasedness (the base distribution is recovered when averaging over messages) and reports improved multi-bit recovery and detection robustness on C4 continuation and LFQA tasks relative to vocabulary-partition baselines, with negligible quality degradation. Code is released.
Significance. If the central claims hold, the work supplies a theoretically grounded alternative to vocabulary-partition watermarks that avoids message-dependent bias in low-entropy contexts. The equal-mass construction and the explicit proof of message-unbiasedness constitute a clear advance; the public code release further strengthens the contribution by enabling direct reproducibility.
minor comments (3)
- §3.1: the description of how the verifier recomputes the exact same partition boundaries under teacher forcing would benefit from an explicit small example (e.g., a 3-token sequence with M=4) showing that the bin edges are identical for greedy and sampled tokens.
- Table 2: the reported AUC values for multi-bit detection would be easier to interpret if the number of bits per token and the exact false-positive-rate threshold were stated in the caption rather than only in the text.
- §4.3: the sentence claiming 'uniform evidence strength across messages' should be accompanied by a brief quantitative check (e.g., the variance of per-message detection scores) to make the claim directly verifiable from the reported experiments.
Simulated Author's Rebuttal
We thank the referee for the positive summary, recognition of the theoretical contribution (equal-mass partitioning and message-unbiasedness proof), and recommendation of minor revision. The referee correctly notes the advantages over vocabulary-partition baselines in low-entropy settings and the value of the public code release.
Circularity Check
No significant circularity identified
full rationale
The paper derives message-unbiasedness directly from the equal-mass partitioning of the [0,1) interval into M bins, each assigned exactly 1/M probability mass by construction. This ensures that averaging the modified distributions over all messages recovers the base distribution as a straightforward mathematical consequence of the uniform bin probabilities, independent of context entropy. The verifier reconstruction under teacher forcing follows identically because partitions are deterministic functions of the prefix and model output distribution alone. No equations reduce a claimed prediction to a fitted parameter, no self-citations bear load on the central property, and no ansatz or uniqueness theorem is imported. The derivation is self-contained against the stated construction.
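The averaging step the rationale appeals to can be written out directly. Let $I_t = [F(t-1), F(t))$ be token $t$'s cumulative interval and $B_m = [m/M, (m+1)/M)$ the $m$-th bin; sampling uniformly within $B_m$ gives

```latex
P(t \mid m) \;=\; \frac{|I_t \cap B_m|}{1/M} \;=\; M\,|I_t \cap B_m|,
\qquad
\frac{1}{M}\sum_{m=0}^{M-1} P(t \mid m)
\;=\; \sum_{m=0}^{M-1} |I_t \cap B_m|
\;=\; |I_t| \;=\; p(t),
```

since the bins tile $[0,1)$ without overlap. The recovery of the base distribution is thus a two-line identity, with no fitted quantities involved.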
Axiom & Free-Parameter Ledger
free parameters (1)
- M
axioms (1)
- standard math: the unit interval [0,1), equipped with the cumulative distribution function of any discrete token distribution, can be partitioned into M contiguous sub-intervals each carrying probability mass exactly 1/M (bin boundaries may fall inside a single token's cumulative interval).