pith. machine review for the scientific record.

arxiv: 2604.10145 · v1 · submitted 2026-04-11 · 💻 cs.CR

Recognition: unknown

Mask-Free Privacy Extraction and Rewriting: A Domain-Aware Approach via Prototype Learning

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 16:02 UTC · model grok-4.3

classification 💻 cs.CR
keywords privacy rewriting · prototype learning · contrastive learning · differential privacy · LLM deployment · domain adaptation · mask-free extraction · preference alignment

The pith

DAMPER learns domain prototypes via contrastive learning to localize and rewrite private text spans without masks or prompts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tries to establish that contrastive learning on domain data can distill latent privacy semantics into compact prototypes that autonomously identify private spans in text, which then guide the creation of preference pairs for training a rewriting policy. This matters to a sympathetic reader because full-text rewriting distorts too much context while prompt-based or mask-based alternatives either leak private information or require impractical manual work. If the approach holds, LLMs could be deployed in sensitive domains with automated, domain-tuned privacy protection that adds formal differential privacy at the span level during inference. The method avoids human annotations entirely by using the prototypes as semantic anchors for both localization and alignment. Experiments reportedly show better privacy-utility balance than existing baselines.

Core claim

DAMPER operationalizes latent privacy semantics into compact Domain Privacy Prototypes via contrastive learning, enabling precise, autonomous span localization. Furthermore, Prototype-Guided Preference Alignment leverages the learned prototypes as semantic anchors to construct preference pairs, optimizing a domain-compliant rewriting policy without human annotations. At inference time, DAMPER integrates a sampling-based Exponential Mechanism to provide rigorous span-level Differential Privacy guarantees.

What carries the argument

Domain Privacy Prototypes, compact vectors learned via contrastive learning on domain data that anchor both span localization and the construction of preference pairs for rewriting policy optimization.
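The anchoring role of the prototypes can be sketched minimally: treat a prototype as the unit-norm mean of private-span embeddings and localize by cosine-similarity threshold. This is a hedged illustration under assumed embeddings and a made-up threshold, not the paper's encoder, contrastive loss, or localization rule.

```python
import numpy as np

def l2norm(x):
    """Normalize vectors (last axis) to unit length."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def domain_prototype(private_span_embs):
    """One simplified reading of a 'compact prototype': the unit-norm
    mean of private-span embeddings from one domain (assumption)."""
    return l2norm(private_span_embs.mean(axis=0))

def localize(span_embs, prototype, threshold=0.7):
    """Flag spans whose cosine similarity to the prototype exceeds a
    threshold (hypothetical decision rule)."""
    sims = l2norm(span_embs) @ prototype
    return sims > threshold

# Synthetic demo: private spans cluster around one direction, other
# spans are isotropic noise.
rng = np.random.default_rng(0)
axis = l2norm(rng.normal(size=(1, 16)))[0]
private = axis + 0.1 * rng.normal(size=(20, 16))
other = rng.normal(size=(20, 16))

proto = domain_prototype(private)
flags = localize(np.vstack([private, other]), proto)
```

On this toy data the clustered "private" spans score near 1.0 against the prototype while random spans score near 0, which is the separation the method's premise requires.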

If this is right

  • Private span localization becomes fully autonomous and does not rely on unstable LLM instruction following or static dictionaries.
  • A rewriting policy can be optimized directly from prototype-derived preference pairs without any human-labeled data.
  • Span-level differential privacy is enforced at inference through the exponential mechanism sampling process.
  • The overall pipeline delivers a measurable improvement in the privacy-utility trade-off over full-text, mask-based, and prompt-based baselines on domain-specific tasks.
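The DP step in the third bullet can be illustrated with the textbook exponential mechanism: sample one candidate rewrite for a detected span with probability proportional to exp(ε·u / 2Δu). The candidate strings and utility scores below are invented for illustration; the paper's actual scoring function is not reproduced here.

```python
import numpy as np

def exponential_mechanism(utilities, epsilon, sensitivity=1.0, rng=None):
    """Standard exponential mechanism: sample index i with probability
    proportional to exp(epsilon * u_i / (2 * sensitivity))."""
    rng = rng or np.random.default_rng()
    logits = epsilon * np.asarray(utilities, dtype=float) / (2.0 * sensitivity)
    logits -= logits.max()            # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

# Hypothetical candidate rewrites for one detected span, scored for
# semantic fit (higher = better; scores assumed, not from the paper).
candidates = ["a prior condition", "a past illness", "an unspecified history"]
utilities = [0.9, 0.6, 0.3]
idx = exponential_mechanism(utilities, epsilon=2.0, rng=np.random.default_rng(1))
chosen = candidates[idx]
```

Smaller ε flattens the sampling distribution (more privacy, less utility); larger ε concentrates mass on the highest-utility rewrite, which is exactly the privacy-utility dial the review discusses.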

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The prototype approach could be extended to other alignment objectives, such as reducing toxicity or bias, by swapping the contrastive objective for the target attribute.
  • Separate prototype sets could be trained per subdomain and combined at inference to handle mixed-domain inputs.
  • If domain data is limited, the contrastive pre-training step might be replaced by few-shot or synthetic data augmentation while preserving the same downstream alignment logic.
  • One could test cross-domain transfer by training prototypes on medical notes and evaluating rewriting quality on legal documents to quantify generalization limits.

Load-bearing premise

Contrastive learning on domain data will produce prototypes that reliably separate private from non-private spans without excessive false positives or leakage, and the resulting preference pairs will yield a rewriting policy that generalizes beyond the training distribution.
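One way the premise could cash out in code: score each candidate rewrite by distance from the privacy prototype (obfuscation) plus closeness to a domain prototype (fidelity), then take the best and worst candidates as the (preferred, dispreferred) pair for preference optimization. The scoring rule, prototypes, and strings below are assumptions for illustration, not the paper's composite reward.

```python
import numpy as np

def l2norm(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def prototype_preference_pair(candidates, cand_embs, privacy_proto, domain_proto):
    """Rank candidate rewrites by (domain fidelity - privacy similarity);
    return (preferred, dispreferred) for a DPO-style preference pair."""
    embs = l2norm(np.asarray(cand_embs, dtype=float))
    scores = embs @ domain_proto - embs @ privacy_proto
    order = np.argsort(scores)
    return candidates[order[-1]], candidates[order[0]]

# Toy 4-d embedding space: privacy and domain prototypes on orthogonal axes.
privacy_proto = np.array([1.0, 0.0, 0.0, 0.0])
domain_proto = np.array([0.0, 1.0, 0.0, 0.0])
cands = ["still leaks the condition", "generic but on-domain"]
embs = [[0.9, 0.2, 0.0, 0.0],   # near the privacy prototype
        [0.1, 0.9, 0.0, 0.0]]   # near the domain prototype
chosen, rejected = prototype_preference_pair(cands, embs, privacy_proto, domain_proto)
```

If the prototypes fail to separate private from non-private semantics, both scores become noise and the resulting pairs mislabel preferences, which is precisely the failure mode the premise rules out.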

What would settle it

Apply the learned prototypes to a held-out dataset with manually annotated private spans and measure whether localization precision and recall are high enough to match or exceed baseline privacy-utility scores while still satisfying the differential privacy guarantee.
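The settling experiment amounts to standard span-level precision/recall against gold annotations. A minimal exact-match version over character-offset spans (the spans below are hypothetical):

```python
def span_prf(predicted, gold):
    """Exact-match precision / recall / F1 over sets of (start, end) spans."""
    pred, gold = set(predicted), set(gold)
    tp = len(pred & gold)                       # spans matching exactly
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical held-out example: two of three predictions match the annotation.
p, r, f = span_prf([(0, 5), (10, 18), (30, 34)],
                   [(0, 5), (10, 18), (40, 44)])
```

Exact match is the strictest criterion; a partial-overlap variant would credit near misses and usually flatters the localizer, so the choice of matching rule should be reported alongside the scores.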

Figures

Figures reproduced from arXiv: 2604.10145 by Hainan Zhang, Qingchen Yu, Qinnan Zhang, Xiaodong Li, Yifan Sun, Yuhua Wang, Zhiming Zheng, Zixuan Qin.

Figure 1: Challenge illustration. (1) Static repository matching requires distinct privacy libraries for each domain, incurring excessive storage and maintenance overheads. (2) Prompt-based localizations lack granularity, resulting in compromised performance: they indiscriminately mask generic terms (low Precision) while missing actual sensitive ones (suboptimal Recall). Conversely, DAMPER leverages domain prototypes…
Figure 2: Overview of the DAMPER. (A) Offline Training Phase (Sec. 4.1): We first employ multi-domain contrastive learning to cluster annotated spans into compact Domain Privacy Prototypes (e.g., Pmed, Pleg). These prototypes guide DPO training for rewriter πθ using a composite reward to balance semantic obfuscation and domain fidelity. (B) Online Inference Phase (Sec. 4.2): The module segments user input via TextCh…
Figure 3: T-SNE visualization of span representations produced by backbone g(·) vs. span encoder h(·).
Figure 4: Hyper-parameter sensitivity of DAMPER on Pri-Mixture.
Figure 5: Performance under different Top-% predefined privacy spans on Pri-DDXPlus.
Figure 6: Results of privacy attacks on span-level rewriting.
Figure 7: Computational cost comparison of different methods on Pri-DDXPlus and Pri-SLJA.
Figure 8: Qualitative case study of medical-domain. Yellow marks ground-truth privacy. In rewritten text, Blue and Red denote TP and FP rewrites, respectively, while residual Yellow indicates FN. Indices [1]–[4] align spans across panels; a/b label FPs. A green check signifies a correct downstream prediction.
Figure 9: Qualitative case study of medical-domain. (Same legend as Figure 8.)
Figure 10: Qualitative case study of legal-domain. (Same legend as Figure 8.)
Figure 11: Qualitative case study of legal-domain. (Same legend as Figure 8.)
Figure 12: Prompt template for candidate rewrite generation.
Figure 13: Prompt template for span-level rewriting.
Figure 14: Prompt template for accuracy evaluation.
Figure 15: Pri-DDXPlus prompt template for model generation.
Figure 16: Pri-SLJA prompt template for model generation.
Figure 17: Pri-DDXPlus prompt template for LLM-J evaluation.
Figure 18: Pri-SLJA prompt template for LLM-J evaluation.
Figure 19: Prompt template for prompt injection attack.
Figure 20: Pri-DDXPlus prompt template for prompt-based zero-shot privacy localization.
Figure 21: Pri-SLJA prompt template for prompt-based zero-shot privacy localization.
Original abstract

Client-side privacy rewriting is crucial for deploying LLMs in privacy-sensitive domains. However, existing approaches struggle to balance privacy and utility. Full-text methods often distort context, while span-level approaches rely on impractical manual masks or brittle static dictionaries. Attempts to automate localization via prompt-based LLMs prove unreliable, as they suffer from unstable instruction following that leads to privacy leakage and excessive context scrubbing. To address these limitations, we propose DAMPER (Domain-Aware Mask-free Privacy Extraction and Rewriting). DAMPER operationalizes latent privacy semantics into compact Domain Privacy Prototypes via contrastive learning, enabling precise, autonomous span localization. Furthermore, we introduce a Prototype-Guided Preference Alignment, which leverages learned prototypes as semantic anchors to construct preference pairs, optimizing a domain-compliant rewriting policy without human annotations. At inference time, DAMPER integrates a sampling-based Exponential Mechanism to provide rigorous span-level Differential Privacy (DP) guarantees. Extensive experiments demonstrate that DAMPER significantly outperforms existing baselines, achieving a superior privacy-utility trade-off.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The manuscript proposes DAMPER, a domain-aware mask-free method for privacy extraction and rewriting in LLMs. It learns compact Domain Privacy Prototypes via contrastive learning to autonomously localize private spans, then uses Prototype-Guided Preference Alignment to construct preference pairs and optimize a domain-compliant rewriting policy without human annotations. At inference, it applies a sampling-based Exponential Mechanism for span-level differential privacy guarantees. The central claim is that DAMPER significantly outperforms existing baselines and achieves a superior privacy-utility trade-off.

Significance. If the empirical claims hold with rigorous quantitative support, the work could enable more practical client-side privacy preservation for LLMs in sensitive domains by removing reliance on manual masks, static dictionaries, or unstable prompt-based localization, while providing formal DP guarantees.

major comments (2)
  1. Abstract: the claim that DAMPER 'significantly outperforms existing baselines' and achieves a 'superior privacy-utility trade-off' is presented without any quantitative metrics, baseline names, dataset descriptions, or ablation results; the supporting evidence is described only as 'extensive experiments' and 'qualitative experiment summaries.' This is load-bearing for the central claim and prevents verification of the asserted superiority.
  2. The separation assumption underlying Domain Privacy Prototypes (learned via contrastive learning) is load-bearing: the manuscript provides no equations, sampling details, or hard-negative construction strategy for the contrastive stage. If the learned embeddings fail to tightly cluster private semantics separately from non-private ones, both span localization and the downstream Prototype-Guided Preference Alignment will operate on noisy pairs, directly undermining the claimed privacy-utility improvement.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback on our manuscript. We have carefully reviewed the major comments and provide point-by-point responses below, outlining specific revisions to address the concerns while preserving the core contributions of DAMPER.

Point-by-point responses
  1. Referee: Abstract: the claim that DAMPER 'significantly outperforms existing baselines' and achieves a 'superior privacy-utility trade-off' is presented without any quantitative metrics, baseline names, dataset descriptions, or ablation results; the supporting evidence is described only as 'extensive experiments' and 'qualitative experiment summaries.' This is load-bearing for the central claim and prevents verification of the asserted superiority.

    Authors: We agree that the abstract would be strengthened by including concrete quantitative support for the central claims. In the revised manuscript, we will add specific metrics from our experiments (e.g., relative improvements in privacy leakage reduction and utility preservation on named datasets such as medical and legal corpora, compared to baselines including prompt-based localization and static dictionary methods). This will allow immediate verification of the privacy-utility trade-off without requiring readers to consult the full experimental section. revision: yes

  2. Referee: The separation assumption underlying Domain Privacy Prototypes (learned via contrastive learning) is load-bearing: the manuscript provides no equations, sampling details, or hard-negative construction strategy for the contrastive stage. If the learned embeddings fail to tightly cluster private semantics separately from non-private ones, both span localization and the downstream Prototype-Guided Preference Alignment will operate on noisy pairs, directly undermining the claimed privacy-utility improvement.

    Authors: We acknowledge that the contrastive learning stage for Domain Privacy Prototypes requires fuller technical detail to substantiate the separation assumption. In the revision, we will include the complete contrastive loss formulation, the positive/negative pair sampling procedure (including domain-specific batch construction), and the hard-negative mining strategy based on embedding similarity thresholds. These additions will clarify how private semantics are isolated from non-private ones, directly supporting the reliability of span localization and preference alignment. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper introduces DAMPER as a new pipeline combining contrastive learning for Domain Privacy Prototypes with prototype-guided preference alignment and sampling-based DP at inference. No equations, derivations, or self-citations are shown that reduce the claimed privacy-utility superiority to fitted inputs by construction, self-definitional loops, or renamed known results. Performance claims rest on external experimental benchmarks against baselines rather than internal reductions; the contrastive and preference objectives are standard losses applied to domain data without the target metric being presupposed in the fitting process. The derivation chain remains self-contained and falsifiable via held-out evaluation.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 1 invented entity

The method introduces domain privacy prototypes as learned entities and relies on standard assumptions of contrastive learning and preference optimization; no explicit free parameters or axioms are detailed in the abstract.

invented entities (1)
  • Domain Privacy Prototypes (no independent evidence)
    purpose: Compact representations of latent privacy semantics for autonomous span localization
    Operationalized via contrastive learning to replace manual masks or static dictionaries

pith-pipeline@v0.9.0 · 5502 in / 1138 out tokens · 61151 ms · 2026-05-10T16:02:26.526216+00:00 · methodology

discussion (0)

