pith. machine review for the scientific record.

arxiv: 2604.16521 · v1 · submitted 2026-04-16 · 💻 cs.CR · cs.AI

Recognition: unknown

CAMP: Cumulative Agentic Masking and Pruning for Privacy Protection in Multi-Turn LLM Conversations

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 11:33 UTC · model grok-4.3

classification 💻 cs.CR cs.AI
keywords cumulative PII exposure · multi-turn conversations · privacy protection · LLM masking · agentic systems · retroactive pruning · session registry

The pith

CAMP tracks PII across conversation turns with a registry and risk graph to neutralize cumulative exposure while keeping utility intact.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that per-turn PII masking leaves multi-turn LLM conversations vulnerable because users can disclose fragments like name, job, location, and health details separately, allowing full re-identification even though no individual message would trigger a per-turn masker. It formalizes this risk as Cumulative PII Exposure (CPE) and introduces CAMP to maintain a session registry, build a co-occurrence graph of entity combinations, calculate an exposure score after every turn, and retroactively mask earlier history once the score exceeds a configurable threshold. A sympathetic reader would care because agentic LLM use routinely involves extended back-and-forth exchanges where privacy risks compound, yet existing tools treat messages in isolation and provide no cross-turn protection.

Core claim

CAMP maintains a session-level PII registry, constructs a co-occurrence graph to model combination risk between entity types, computes a CPE score after each turn, and triggers retroactive masking of conversation history when the score crosses a configurable threshold. We evaluate CAMP on four synthetic multi-turn scenarios spanning healthcare, hiring, finance, and general conversation, demonstrating that per-turn baselines expose re-identifiable profiles that CAMP successfully neutralizes while preserving full conversational utility.
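The four-step loop described here (registry update, co-occurrence graph, CPE scoring, threshold-triggered retroactive masking) can be sketched in a few lines. Everything concrete below is invented for illustration: the paper publishes neither its CPE formula nor its edge weights, so the entity types, risk values, and threshold are placeholder assumptions, not CAMP's actual implementation.

```python
from itertools import combinations

# Hypothetical pairwise combination risks (the "co-occurrence graph" edges);
# CAMP's real graph construction and weighting are not given in the paper.
RISK = {
    frozenset({"NAME", "EMPLOYER"}): 0.3,
    frozenset({"NAME", "LOCATION"}): 0.3,
    frozenset({"NAME", "MEDICAL"}): 0.5,
    frozenset({"EMPLOYER", "LOCATION"}): 0.2,
    frozenset({"EMPLOYER", "MEDICAL"}): 0.4,
    frozenset({"LOCATION", "MEDICAL"}): 0.4,
}

def cpe_score(registry):
    """Cumulative exposure: sum edge risks over all pairs of types seen so far."""
    return sum(RISK.get(frozenset(p), 0.0) for p in combinations(sorted(registry), 2))

def mask(text, entities):
    """Replace each detected surface value with its typed placeholder."""
    for value, etype in entities.items():
        text = text.replace(value, f"[{etype}]")
    return text

def process_turn(registry, history, text, entities, threshold=0.8):
    """One CAMP-style turn. `entities` maps surface string -> entity type.

    Updates the session registry, appends the turn, and retroactively masks
    the entire visible history once the CPE score crosses the threshold.
    """
    registry |= set(entities.values())
    history.append((text, entities))
    if cpe_score(registry) >= threshold:
        history[:] = [(mask(t, e), e) for t, e in history]
    return cpe_score(registry)
```

With a threshold of 0.9 and the weights above, disclosing a name, then employer plus city, then a diagnosis leaves the first two turns untouched (scores 0.0 and 0.8) and rewrites all three once the diagnosis lands (score 2.1), which is the retroactive behavior the core claim describes.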

What carries the argument

The CPE score, derived from a session-level PII registry and co-occurrence graph of entity types, which decides when to apply retroactive masking to prior turns.

If this is right

  • Per-turn baselines leave users exposed to profile reconstruction from scattered disclosures across turns.
  • CAMP's threshold-triggered masking can prevent re-identification in extended conversations without requiring changes to the underlying LLM.
  • The approach applies across domains such as healthcare, finance, and hiring where piecemeal PII release is common.
  • Configurable thresholds allow tuning between privacy strength and minimal intervention in the conversation history.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Deploying CAMP could require new session management layers in chat interfaces to store and update the PII registry securely.
  • The method might interact with context-window limits, since masking shortens effective history and could force models to rely more on summarized state.
  • Real-world testing with actual user logs, rather than synthetic scenarios, would be needed to measure how often the co-occurrence graph triggers masking in practice.
  • CAMP could combine with existing single-turn detectors to create a layered system that addresses both immediate and cumulative risks.

Load-bearing premise

That the co-occurrence graph accurately models real-world combination risks between entity types and that retroactive masking preserves full conversational utility without disrupting model context or introducing new errors.

What would settle it

Running the same multi-turn healthcare or hiring scenario with CAMP active and checking whether the model can still produce coherent, contextually accurate responses to follow-up questions after the retroactive masking step, while a manual audit confirms no re-identifiable profile remains in the visible history.
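The manual-audit half of this test can be made mechanical: sweep the post-masking history and fail if any pre-defined risky combination of entity types is still fully visible. This is an illustrative sketch under assumed data shapes; the paper does not enumerate which entity-type sets count as re-identifying, so the combination list below is hypothetical.

```python
# Hypothetical risky combinations an auditor might check for.
RISKY_COMBOS = [
    {"NAME", "EMPLOYER", "LOCATION"},
    {"NAME", "MEDICAL"},
]

def visible_entity_types(history):
    """Entity types whose surface values still appear verbatim in the history.

    `history` is a list of (text, {value: entity_type}) pairs recording what a
    per-turn detector originally found in each turn.
    """
    visible = set()
    for text, entities in history:
        for value, etype in entities.items():
            if value in text:  # value was not replaced by a placeholder
                visible.add(etype)
    return visible

def audit(history):
    """Return every risky combination still fully exposed; empty list = pass."""
    visible = visible_entity_types(history)
    return [combo for combo in RISKY_COMBOS if combo <= visible]
```

A history whose name and employer have been replaced with typed placeholders passes the audit even if a lone location remains visible, while an unmasked name plus diagnosis is flagged, matching the paper's framing that single entities are tolerable but combinations are not.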

Figures

Figures reproduced from arXiv: 2604.16521 by Aman Panjwani.

Figure 3. CAMP intervention turn across all scenarios under three thresh… (caption truncated at source; image: figures/full_fig_p006_3.png)
Figure 2. Co-occurrence graph evolution across turns in S3 Finance. Nodes… (caption truncated at source; image: figures/full_fig_p006_2.png)
Original abstract

The deployment of Large Language Models in agentic, multi-turn conversational settings has introduced a class of privacy vulnerabilities that existing protection mechanisms are not designed to address. Current approaches to Personally Identifiable Information (PII) masking operate on a per-turn basis, scanning each user message in isolation and replacing detected entities with typed placeholders before forwarding sanitized text to the model. While effective against direct identifier leakage within a single message, these methods are fundamentally stateless and fail to account for the compounding privacy risk that emerges when PII fragments accumulate across conversation turns. A user who separately discloses their name, employer, location, and medical condition across several messages has revealed a fully re-identifiable profile - yet no individual message would trigger a per-turn masker. We formalize this phenomenon as Cumulative PII Exposure (CPE) and propose CAMP (Cumulative Agentic Masking and Pruning), a cross-turn privacy protection framework for multi-turn LLM conversations. CAMP maintains a session-level PII registry, constructs a co-occurrence graph to model combination risk between entity types, computes a CPE score after each turn, and triggers retroactive masking of conversation history when the score crosses a configurable threshold. We evaluate CAMP on four synthetic multi-turn scenarios spanning healthcare, hiring, finance, and general conversation, demonstrating that per-turn baselines expose re-identifiable profiles that CAMP successfully neutralizes while preserving full conversational utility.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that per-turn PII masking is insufficient for multi-turn LLM conversations because PII fragments can accumulate into re-identifiable profiles across turns. It formalizes this as Cumulative PII Exposure (CPE), introduces CAMP (which maintains a session-level PII registry, builds a co-occurrence graph of entity-type risks, computes a CPE score per turn, and applies retroactive masking/pruning when a configurable threshold is crossed), and reports that CAMP neutralizes re-identification risks on four synthetic scenarios (healthcare, hiring, finance, general) while preserving full conversational utility, unlike stateless baselines.

Significance. If the central claims hold, CAMP addresses a genuine gap in privacy mechanisms for agentic, multi-turn LLM use cases. The formalization of CPE and the retroactive, graph-based triggering mechanism represent a concrete engineering contribution that could be adopted in production safety layers. The paper ships a clear algorithmic description and synthetic test harness, which are positive for reproducibility.

major comments (2)
  1. [Evaluation] Evaluation section: the manuscript reports only that CAMP 'successfully neutralizes' re-identifiable profiles and 'preserves full conversational utility' on four synthetic scenarios, but provides no numerical CPE scores, no exact formula or pseudocode for CPE computation, no description of co-occurrence-graph construction (data source, edge-weighting, or validation against external PII co-occurrence statistics), and no quantitative utility metrics (e.g., downstream task accuracy, coherence scores, or human ratings before/after masking). These omissions make the neutralization and utility-preservation claims impossible to assess for robustness.
  2. [§3] §3 (CAMP framework): the claim that the co-occurrence graph 'models combination risk between entity types' is load-bearing for the CPE score and masking trigger, yet the text gives no concrete construction details or justification that the graph reflects real-world adversary re-identification power rather than ad-hoc assumptions. Without this, the threshold-based retroactive masking cannot be shown to be more than heuristic.
minor comments (2)
  1. [Introduction] The abstract and introduction cite 'per-turn baselines' but do not name or reference the specific masking implementations used for comparison; adding these citations would improve clarity.
  2. [§3] Notation for the CPE score and threshold parameter is introduced without an explicit equation; a numbered equation would aid readers.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback and the recommendation for major revision. We address each major comment below and will incorporate the requested details and clarifications in the revised manuscript to strengthen the evaluation and framework description.

Point-by-point responses
  1. Referee: [Evaluation] Evaluation section: the manuscript reports only that CAMP 'successfully neutralizes' re-identifiable profiles and 'preserves full conversational utility' on four synthetic scenarios, but provides no numerical CPE scores, no exact formula or pseudocode for CPE computation, no description of co-occurrence-graph construction (data source, edge-weighting, or validation against external PII co-occurrence statistics), and no quantitative utility metrics (e.g., downstream task accuracy, coherence scores, or human ratings before/after masking). These omissions make the neutralization and utility-preservation claims impossible to assess for robustness.

    Authors: We concur that the evaluation section would benefit from more quantitative details to allow readers to assess the robustness of our claims. In the revised manuscript, we will include the numerical CPE scores computed for each turn in the four synthetic scenarios. We will also provide the precise formula used for the CPE score and accompanying pseudocode. For the co-occurrence graph, we will detail its construction process, including the data sources (synthetic scenario definitions), edge-weighting methodology based on entity-type combination risks, and note the absence of external validation as the evaluation is synthetic. Additionally, we will report quantitative utility metrics, such as task-specific accuracy and automated coherence scores, comparing conversations with and without CAMP intervention. revision: yes

  2. Referee: [§3] §3 (CAMP framework): the claim that the co-occurrence graph 'models combination risk between entity types' is load-bearing for the CPE score and masking trigger, yet the text gives no concrete construction details or justification that the graph reflects real-world adversary re-identification power rather than ad-hoc assumptions. Without this, the threshold-based retroactive masking cannot be shown to be more than heuristic.

    Authors: We recognize that Section 3 provides an overview rather than exhaustive implementation specifics. We will revise this section to include concrete details on how the co-occurrence graph is constructed from the entity types present in the conversation history, with explicit edge weights assigned according to predefined risk levels for combinations (e.g., name + location + medical condition). We will justify these weights by referencing established privacy risks in the literature on re-identification attacks. While the graph in our experiments is tailored to the synthetic scenarios to illustrate the CPE mechanism, we will emphasize its role as a modular component that can incorporate real-world co-occurrence data when available. This will clarify that the retroactive masking is driven by a principled, albeit configurable, risk model rather than purely ad-hoc rules. revision: yes

Circularity Check

0 steps flagged

No significant circularity: CAMP is a self-contained novel construction

Full rationale

The paper defines Cumulative PII Exposure (CPE) as a new formalization of compounding privacy risk across turns and introduces CAMP as a framework that maintains a session-level PII registry, builds a co-occurrence graph, computes a CPE score, and applies retroactive masking at a threshold. No equations, derivations, or load-bearing steps reduce the central claims to fitted parameters, self-referential definitions, or prior self-citations. The co-occurrence graph and CPE scoring are presented as internal components of the proposed system rather than outputs derived from the results they enable. Evaluation on synthetic scenarios is described but does not involve renaming known results or smuggling ansatzes via citation. The derivation chain remains independent of its own outputs.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 2 invented entities

The central claim depends on the validity of the new CPE score and graph model, which are not derived from first principles or external data but postulated for this framework.

free parameters (1)
  • CPE threshold
    Configurable value that determines when to trigger retroactive masking based on the computed CPE score.
axioms (1)
  • domain assumption: Accurate detection and typing of PII entities in individual messages is possible
    The framework relies on per-turn PII detection to populate the session-level registry.
invented entities (2)
  • Cumulative PII Exposure (CPE) score (no independent evidence)
    purpose: To quantify the accumulating risk of re-identification from PII fragments across turns
    Newly defined metric without reference to prior validated measures.
  • Co-occurrence graph (no independent evidence)
    purpose: To model risks from combinations of different entity types
    Introduced structure for assessing combination privacy risks.

pith-pipeline@v0.9.0 · 5544 in / 1520 out tokens · 31124 ms · 2026-05-10T11:33:07.466476+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

20 extracted references · 8 canonical work pages

  [1] Anthropic, “Building effective agents,” https://www.anthropic.com/research/building-effective-agents, 2024.
  [2] H. Chase, “LangChain: Building applications with LLMs through composability,” 2023.
  [3] Microsoft, “Presidio: Data protection and de-identification SDK,” https://github.com/microsoft/presidio, 2021.
  [4] S. Siyan et al., “PAPILLON: Sequentially-interactive privacy protection for large language model pipelines,” in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2024.
  [5] H. Nissenbaum, “Privacy as contextual integrity,” Washington Law Review, vol. 79, no. 1, 2004, pp. 119–158.
  [6] P. Lison, I. Pilán, D. Sanchez, M. Batet, and L. Øvrelid, “Anonymisation models for text data: State of the art, challenges and future directions,” in Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics, 2021, pp. 4188–4203.
  [7] N. Lukas, A. Salem, R. Sim, S. Tople, L. Wutschitz, and S. Zanella-Béguelin, “Analyzing leakage of personally identifiable information in language models,” IEEE Symposium on Security and Privacy, 2023.
  [8] S. Yanamala et al., “Adaptive PII mitigation framework for large language models,” arXiv preprint arXiv:2501.12465, 2025.
  [9] B. Manzanares-Salor, D. Sanchez, and P. Lison, “Truthful text sanitization guided by inference attacks,” arXiv preprint arXiv:2412.12928, 2024.
  [10] N. Carlini, F. Tramer, E. Wallace, M. Jagielski, A. Herbert-Voss, K. Lee, A. Roberts, T. Brown, D. Song, U. Erlingsson, A. Oprea, and C. Raffel, “Extracting training data from large language models,” in Proceedings of the 30th USENIX Security Symposium, 2021, pp. 2633–2650.
  [11] M. Nasr, N. Carlini, J. Hayase, M. Jagielski, A. F. Cooper, D. Ippolito, C. A. Choquette-Choo, E. Wallace, F. Tramèr, and K. Lee, “Scalable extraction of training data from (production) language models,” arXiv preprint arXiv:2311.17035, 2023.
  [12] R. Staab, M. Vero, M. Balunovic, and M. Vechev, “Beyond memorization: Violating privacy via inference with large language models,” in International Conference on Learning Representations, 2024.
  [13] K. K. Nakka, A. Frikha, R. Mendes, X. Jiang, and X. Zhou, “PII-Scope: A comprehensive study on training data PII extraction attacks in LLMs,” arXiv preprint arXiv:2410.06704, 2024.
  [14] R. Staab, M. Vero, M. Balunovic, and M. Vechev, “Adversarial anonymization of text via feedback-guided rewriting,” in ICLR Workshop on Reliable and Responsible Foundation Models, 2024.
  [15] Authors, “Self-refining language model anonymizers via adversarial distillation,” arXiv preprint arXiv:2506.01420, 2025.
  [16] S. Siyan et al., “Understanding PII leakage in large language models: A systematic survey,” Proceedings of IJCAI, 2025.
  [17] Authors, “Robust utility-preserving text anonymization based on LLM rewriting,” Proceedings of ACL, 2025.
  [18] A. Authors, “AgentLeak: A full-stack benchmark for privacy leakage in multi-agent LLM systems,” arXiv preprint arXiv:2602.11510, 2026.
  [19] Authors, “Privacy guard and token parsimony by prompt and context handling and LLM routing,” arXiv preprint arXiv:2603.28972, 2025.
  [20] F. Mireshghallah et al., “Can LLMs keep a secret? Testing privacy implications of language models via contextual integrity theory,” arXiv preprint arXiv:2310.17884, 2023.