arxiv: 2604.12342 · v1 · submitted 2026-04-14 · 💻 cs.CR · cs.CV

Recognition: unknown

CoLA: A Choice Leakage Attack Framework to Expose Privacy Risks in Subset Training

Cheng-Long Wang, Di Wang, Qi Li, Yinzhi Cao

Pith reviewed 2026-05-10 15:52 UTC · model grok-4.3

classification 💻 cs.CR cs.CV

keywords privacy leakagesubset trainingmembership inferenceside-channel attacksmachine learning privacydata selectiontraining membership

0 comments

The pith

Subset training leaks private data choices through selection participation even when fewer samples are used.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Modern ML pipelines routinely train on subsets rather than full datasets to scale efficiently with little utility cost. The prevailing view holds that smaller training sets should also reduce privacy exposure. This paper shows the opposite: the act of choosing which data points to include or exclude creates new leaks about data participation. Adversaries can exploit side-channel metadata from the selection step or the trained model's outputs to infer both which points entered training and which points were merely considered during selection. The result enlarges the privacy surface from individual models to the entire data preparation supply chain.

Core claim

We introduce CoLA, a unified attack framework that defines two realistic adversary settings—subset-aware side-channel attacks and black-box attacks—and two distinct privacy surfaces: training-membership membership inference attacks (TM-MIA) that target only the final training set, and selection-participation membership inference attacks (SP-MIA) that target every sample that participated in the subset selection process. Experiments on vision and language models demonstrate that both surfaces are vulnerable, with selection participation leaking more broadly than training membership alone.

What carries the argument

CoLA framework that separates side-channel metadata attacks from black-box model-output attacks and distinguishes training-membership MIA from the broader selection-participation MIA that covers the full data-model supply chain.

If this is right

Subset training expands privacy risks from the trained model to every candidate sample considered during data selection.
Existing membership inference threat models underestimate risks because they ignore selection participation.
Both side-channel metadata and black-box model outputs can be used to recover selection information.
Privacy leakage appears in both vision and language model pipelines that rely on subset selection.
Risks extend across the broader ML ecosystem rather than remaining isolated to individual training runs.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Data filtering pipelines used in large language models may require new selection algorithms that hide participation patterns.
Provenance tracking in ML data pipelines could become a new attack vector if selection choices are logged or observable.
Defenses focused only on model outputs may leave side-channel metadata leaks unaddressed.
Subset selection methods could be redesigned to minimize distinguishability of included versus excluded samples.

Load-bearing premise

Practical adversaries can obtain enough side-channel information about the subset selection process or that model outputs alone suffice to distinguish selection participation at scale.

What would settle it

An experiment across multiple vision and language models in which no adversary achieves better than random accuracy at distinguishing selection participation using either side-channel metadata or model outputs on held-out data.

Figures

Figures reproduced from arXiv: 2604.12342 by Cheng-Long Wang, Di Wang, Qi Li, Yinzhi Cao.

**Figure 1.** Figure 1: Privacy surfaces under subset training. data generation pipelines, we turn to subset training with real data, where high-fidelity samples remain but the selection process itself exposes a distinct and overlooked channel of privacy leakage. 3 Problem Setting 3.1 Membership inference under subset training Let D0 ⊆ X × Y denote the original dataset that undergoes a subset selection procedure. A selector Se… view at source ↗

**Figure 2.** Figure 2: Signal distributions under subset training. [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗

**Figure 3.** Figure 3: The MIA performance of different attack surface on vision models under black-box setting. underlying subset selection metadata. The inclusion stability-based pipeline of CoLA naturally unifies different attack surfaces within a single framework, thereby facilitating coordinated attacks. 5 Experiments 5.1 Setups Models and Datasets. We conduct experiments on both vision and language models. For the visio… view at source ↗

**Figure 4.** Figure 4: Ablation studies on (left) the window size and [PITH_FULL_IMAGE:figures/full_fig_p007_4.png] view at source ↗

**Figure 7.** Figure 7: For language models, all baseline methods except [PITH_FULL_IMAGE:figures/full_fig_p008_7.png] view at source ↗

**Figure 8.** Figure 8: Choice Leakage Attack (CoLA) across the data–model supply chain. CoLA augments conventional MIA by exploiting subset selection metadata leaked along the data–model supply chain. By identifying which samples are more likely to pass selection, it not only strengthens membership inference but also enables adversaries to craft tailored threats. A Appendix A.1 The Privacy Threats behind Data-Model Supply Chain … view at source ↗

read the original abstract

Training models on a carefully chosen portion of data rather than the full dataset is now a standard preprocess for modern ML. From vision coreset selection to large-scale filtering in language models, it enables scalability with minimal utility loss. A common intuition is that training on fewer samples should also reduce privacy risks. In this paper, we challenge this assumption. We show that subset training is not privacy free: the very choices of which data are included or excluded can introduce new privacy surface and leak more sensitive information. Such information can be captured by adversaries either through side-channel metadata from the subset selection process or via the outputs of the target model. To systematically study this phenomenon, we propose CoLA (Choice Leakage Attack), a unified framework for analyzing privacy leakage in subset selection. In CoLA, depending on the adversary's knowledge of the side-channel information, we define two practical attack scenarios: Subset-aware Side-channel Attacks and Black-box Attacks. Under both scenarios, we investigate two privacy surfaces unique to subset training: (1) Training-membership MIA (TM-MIA), which concerns only the privacy of training data membership, and (2) Selection-participation MIA (SP-MIA), which concerns the privacy of all samples that participated in the subset selection process. Notably, SP-MIA enlarges the notion of membership from model training to the entire data-model supply chain. Experiments on vision and language models show that existing threat models underestimate subset-training privacy risks: the expanded privacy surface leaks both training and selection membership, extending risks from individual models to the broader ML ecosystem.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Subset training adds a selection-participation privacy surface, but the black-box attack on model outputs alone still needs concrete numbers to show it works beyond standard membership inference.

read the letter

Colleague, the paper's main point is that subset training opens up new privacy leaks through the selection process itself, but the black-box attacks on model outputs for selection participation still need better evidence to hold up. The new angle here is defining selection-participation MIA on top of standard training membership. It treats the subset selection as part of the supply chain that can leak info about samples that were considered but maybe not used. The CoLA framework puts side-channel and black-box scenarios together, and they apply it to vision and language models. That framing is useful because it moves the discussion beyond just the final model. They do a solid job showing why the common intuition about fewer samples meaning less risk is incomplete, especially when selection involves metadata or side channels. The soft spot is the black-box setting. The claim that model outputs can reveal whether a sample participated in selection relies on an assumption that hasn't been backed with numbers in the abstract. If those outputs don't carry a distinct selection signal, the attack might not separate from regular membership inference. The paper mentions experiments but without success rates or baselines, it's tough to see how much this really enlarges the surface at scale. This work is for privacy researchers focused on ML data pipelines and preprocessing steps. A reader who follows membership inference literature would find the expanded threat model relevant. It should go to peer review. The idea is worth a closer look, and the authors have started the experiments, but the quantitative support needs to be front and center in revisions.

Referee Report

2 major / 1 minor

Summary. The paper introduces CoLA, a unified framework for analyzing privacy leakage from subset selection in ML training. It challenges the intuition that subset training reduces privacy risks and defines two attack scenarios (Subset-aware Side-channel Attacks and Black-box Attacks) along with two privacy surfaces: Training-membership MIA (TM-MIA) and the novel Selection-participation MIA (SP-MIA), which extends membership inference to the data selection process. The central claim is that adversaries can exploit side-channel metadata or target model outputs to infer both training membership and selection participation, with experiments on vision and language models asserted to show that existing threat models underestimate these risks.

Significance. If the empirical results hold, the work is significant for identifying an expanded privacy surface in common ML pipelines that rely on data subsetting (e.g., coresets, filtering). By formalizing SP-MIA and providing a framework that distinguishes side-channel vs. black-box access, it could influence privacy analysis beyond individual models to the broader data curation ecosystem. The empirical focus on both vision and language models is a strength, though the absence of reported metrics in the abstract limits immediate assessment of practical impact.

major comments (2)

[Abstract] Abstract: the claim that 'experiments on vision and language models show that existing threat models underestimate subset-training privacy risks' is load-bearing for the central thesis, yet the abstract supplies no attack success rates, baselines, ablation results, or quantitative comparison between TM-MIA and SP-MIA. Without these, it is impossible to verify whether model outputs alone suffice to distinguish selection participation at scale rather than reducing to standard membership inference.
[Abstract] The weakest assumption (that black-box adversaries can reliably infer SP-MIA from target model outputs without side-channel metadata) is not supported by any reported evidence in the provided abstract. The paper must include concrete success rates, ROC curves, or comparison tables showing that selection signals are distinguishable from ordinary membership signals; otherwise the enlargement of the privacy surface remains unproven.

minor comments (1)

[Abstract] The abstract would be clearer if it briefly named the specific vision and language models/datasets used and the scale of the experiments (e.g., number of samples, subset sizes).

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful and detailed comments. We agree that the abstract should be strengthened with concrete quantitative results to better support the central claims. We address each major comment below and will revise the manuscript accordingly.

read point-by-point responses

Referee: [Abstract] Abstract: the claim that 'experiments on vision and language models show that existing threat models underestimate subset-training privacy risks' is load-bearing for the central thesis, yet the abstract supplies no attack success rates, baselines, ablation results, or quantitative comparison between TM-MIA and SP-MIA. Without these, it is impossible to verify whether model outputs alone suffice to distinguish selection participation at scale rather than reducing to standard membership inference.

Authors: We appreciate this observation. The full manuscript (Sections 4 and 5) reports attack success rates, AUC values, ROC curves, and direct comparisons between TM-MIA and SP-MIA under both side-channel and black-box settings, including ablations across vision and language models. These results demonstrate that SP-MIA achieves non-trivial success beyond standard membership inference. To address the concern, we will revise the abstract to include representative quantitative metrics (e.g., AUC improvements and success rates) that summarize the key empirical findings. revision: yes
Referee: [Abstract] The weakest assumption (that black-box adversaries can reliably infer SP-MIA from target model outputs without side-channel metadata) is not supported by any reported evidence in the provided abstract. The paper must include concrete success rates, ROC curves, or comparison tables showing that selection signals are distinguishable from ordinary membership signals; otherwise the enlargement of the privacy surface remains unproven.

Authors: We agree that the abstract should explicitly substantiate this point. The manuscript provides evidence through black-box experiments showing that model outputs alone yield SP-MIA performance distinguishable from TM-MIA (e.g., via higher AUC and accuracy over random baselines, with tables comparing the two). We will update the abstract to report these concrete metrics and comparisons, clarifying that the enlargement of the privacy surface is supported by the empirical distinction between selection participation and training membership signals. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical attack framework with no derivations or self-referential reductions

full rationale

The paper presents CoLA as an empirical attack study and framework for subset-training privacy risks, defining TM-MIA and SP-MIA surfaces plus two adversary scenarios (side-channel and black-box) without any equations, fitted parameters, predictions, or derivation chains. Claims rest on experimental results on vision/language models rather than reducing to inputs by construction, self-citations, or ansatzes. No load-bearing steps match the enumerated circularity patterns; the work is self-contained as an attack proposal evaluated externally.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The abstract relies on standard assumptions from membership inference literature (adversary access to model outputs or side channels) without introducing new free parameters or invented entities.

axioms (1)

domain assumption Adversaries can obtain side-channel metadata from subset selection or query model outputs
Invoked when defining Subset-aware Side-channel Attacks and Black-box Attacks

pith-pipeline@v0.9.0 · 5594 in / 1171 out tokens · 24504 ms · 2026-05-10T15:52:28.906236+00:00 · methodology

discussion (0)

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

On-Policy Self-Evolution via Failure Trajectories for Agentic Safety Alignment
cs.AI 2026-05 unverdicted novelty 6.0

FATE lets LLM agents self-evolve safer behaviors by generating and filtering repairs from their own failure trajectories using verifiers and Pareto optimization.

Reference graph

Works this paper leans on

2 extracted references · 2 canonical work pages · cited by 1 Pith paper · 1 internal anchor

[1]

Do membership inference attacks work on large language models?arXiv preprint arXiv:2402.07841, 2024

Coresets for nonparametric estimation-the case of dp-means. InInternational Conference on Machine Learning, pages 209–217. PMLR. Stella Biderman, Hailey Schoelkopf, Quentin Gregory Anthony, Herbie Bradley, Kyle O’Brien, Eric Hal- lahan, Mohammad Aflah Khan, Shivanshu Purohit, USVSN Sai Prashanth, Edward Raff, Aviya Skowron, Lintang Sutawika, and Oskar van...

work page arXiv 2023
[2]

Scaling Language Models: Methods, Analysis & Insights from Training Gopher

Coresets for data-efficient training of machine learning models. InInternational Conference on Machine Learning, pages 6950–6960. PMLR. Alexander Munteanu, Chris Schwiegelshohn, Christian Sohler, and David Woodruff. 2018. On coresets for logistic regression.Advances in Neural Information Processing Systems, 31. Milad Nasr, Reza Shokri, and Amir Houmansadr...

work page internal anchor Pith review arXiv 2018