pith. machine review for the scientific record.

arxiv: 2604.10424 · v1 · submitted 2026-04-12 · 💻 cs.LG

Recognition: unknown

Membership Inference Attacks Expose Participation Privacy in ECG Foundation Encoders

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 16:23 UTC · model grok-4.3

classification 💻 cs.LG
keywords membership inference · ECG · self-supervised learning · foundation encoders · participation privacy · biosignals · connected health · privacy attacks

The pith

Membership inference attacks can identify whether specific ECG recordings contributed to self-supervised foundation encoders even without raw signals or labels.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper audits whether adversaries can infer participation of individuals or cohorts in the pretraining of reusable ECG encoders built with contrastive and masked-reconstruction objectives. It tests this through three attacker interfaces that reflect realistic deployment access: scalar scores only, adaptive subject-level statistics, and latent embeddings. Leakage appears most clearly for small or institution-specific training sets and saturates faster in embedding space for contrastive models, while larger diverse datasets lower the operational risk. The work matters because in connected-health systems, knowing that a person's data was used can itself reveal institutional or clinical context.
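As a concrete illustration of the most restricted of these interfaces, a score-only threshold attack can be sketched in a few lines. This is a generic construction, not the paper's implementation: the function name and the choice of reconstruction error as the scalar score are our assumptions.

```python
# Sketch of a score-only membership threshold attack, calibrated at a
# fixed false-positive rate on known non-member scores. The scalar
# score could be, e.g., a masked-reconstruction error, where members
# tend to score lower; `fpr` is the tolerated non-member flag rate.
import numpy as np

def score_only_attack(member_scores, nonmember_scores, fpr=0.05):
    """Flag a window as 'member' when its score falls below the
    fpr-quantile of the non-member score distribution; return the
    threshold and the resulting true-positive rate on members."""
    threshold = np.quantile(nonmember_scores, fpr)
    tpr = float(np.mean(member_scores < threshold))
    return threshold, tpr
```

Any separation between the member and non-member score distributions yields a true-positive rate above the calibrated false-positive rate, which is exactly the leakage signal the audit measures.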

Core claim

Self-supervised ECG encoders leak membership information through their outputs and internal representations under subject-centric auditing protocols that aggregate window-level queries and calibrate at fixed false-positive rates. Leakage strength depends on the training objective and dataset scale, so restricting access to raw signals and labels alone fails to protect participation privacy.

What carries the argument

Subject-centric membership inference protocol that aggregates window-level queries to subject level and calibrates at fixed false-positive rates, applied across score-only, adaptive statistical, and embedding-access attacker interfaces.
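The aggregation-and-calibration step can be sketched as follows; the mean pooling statistic and the names are illustrative assumptions, since this summary does not specify the paper's exact pooling rule.

```python
# Sketch of subject-centric auditing: pool per-window membership
# scores to one score per subject, then pick the decision threshold
# on non-member subjects at a fixed false-positive rate. Here a
# higher score means "more member-like".
import numpy as np

def subject_scores(window_scores, subject_ids):
    """Average window-level scores within each subject (mean pooling
    is an assumption; other statistics would slot in here)."""
    subjects = np.unique(subject_ids)
    pooled = np.array([window_scores[subject_ids == s].mean()
                       for s in subjects])
    return subjects, pooled

def calibrate_threshold(nonmember_pooled, fpr=0.01):
    """Threshold such that only `fpr` of non-member subjects exceed it."""
    return np.quantile(nonmember_pooled, 1.0 - fpr)
```

Aggregating many windows per subject shrinks the variance of the pooled score, which is why subject-level attacks can succeed even when individual window scores are noisy.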

If this is right

  • Scalar output access alone suffices for membership inference in small-cohort encoders.
  • Contrastive objectives exhibit stronger embedding-space leakage than masked-reconstruction ones.
  • Increasing pretraining dataset size and diversity measurably reduces tail risk of successful attacks.
  • Reusable biosignal encoders deployed as services require ongoing participation-privacy audits rather than one-time access controls.
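The second bullet suggests a simple embedding-space probe. One plausible mechanism (our reading, not a method quoted from the paper) is that contrastive training pulls augmented views of a training subject unusually close together; a tightness statistic over embedded views then serves as a membership score:

```python
# Sketch of an embedding-access membership signal: mean off-diagonal
# cosine similarity among the embedded augmented views of one subject.
# Tighter clustering (a higher value) is read as more member-like.
import numpy as np

def view_tightness(embeddings):
    """embeddings: (n_views, dim) array of one subject's embedded
    views; returns the mean pairwise cosine similarity."""
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T
    n = len(z)
    return float((sim.sum() - np.trace(sim)) / (n * (n - 1)))
```

A statistic like this would saturate quickly for contrastive encoders, consistent with the embedding-space saturation the paper reports.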

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar leakage patterns could appear in other reusable medical-signal encoders if they follow the same self-supervised reuse pattern.
  • Adding output noise or query throttling at deployment time might reduce the observed leakage without retraining.
  • The results imply that participation privacy should be treated as a first-class requirement when releasing biosignal foundation models for cross-institution use.

Load-bearing premise

The cross-dataset auditing setting and the three chosen attacker interfaces accurately reflect the information and query patterns available to realistic adversaries targeting deployed ECG encoders.

What would settle it

Training an encoder on a completely disjoint subject set from the same distribution and rerunning the calibrated attacks would settle it: if attack success against the original "member" subjects remained high even though none of them were in training, the reported participation leakage would be exposed as a distribution artifact rather than genuine membership signal; success collapsing to random-guessing levels on that control would confirm the leakage is membership-specific.

Figures

Figures reproduced from arXiv: 2604.10424 by Amir Rahmani, Ankita Sharma, Elahe Khatibi, Farshad Firouzi, Krishnendu Chakrabarty, Sanaz Rahimi Moosavi, Ziyu Wang.

Figure 1: Motivation and threat surface in connected health.
Figure 2: Participation privacy threat model in a connected-health ecosystem. A private patient cohort (
Figure 3: Impact of learned attackers over score-only at
Figure 4: Learned-attack AUC across datasets and encoder
read the original abstract

Foundation-style ECG encoders pretrained with self-supervised learning are increasingly reused across tasks, institutions, and deployment contexts, often through model-as-a-service interfaces that expose scalar scores or latent representations. While such reuse improves data efficiency and generalization, it raises a participation privacy concern: can an adversary infer whether a specific individual or cohort contributed ECG data to pretraining, even when raw waveforms and diagnostic labels are never disclosed? In connected-health settings, training participation itself may reveal institutional affiliation, study enrollment, or sensitive health context. We present an implementation-grounded audit of membership inference attacks (MIAs) against modern self-supervised ECG foundation encoders, covering contrastive objectives (SimCLR, TS2Vec) and masked reconstruction objectives (CNN- and Transformer-based MAE). We evaluate three realistic attacker interfaces: (i) score-only black-box access to scalar outputs, (ii) adaptive learned attackers that aggregate subject-level statistics across repeated queries, and (iii) embedding-access attackers that probe latent representation geometry. Using a subject-centric protocol with window-to-subject aggregation and calibration at fixed false-positive rates under a cross-dataset auditing setting, we observe heterogeneous and objective-dependent participation leakage: leakage is most pronounced in small or institution-specific cohorts and, for contrastive encoders, can saturate in embedding space, while larger and more diverse datasets substantially attenuate operational tail risk. Overall, our results show that restricting access to raw signals or labels is insufficient to guarantee participation privacy, underscoring the need for deployment-aware auditing of reusable biosignal foundation encoders in connected-health systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The paper conducts an empirical audit of membership inference attacks (MIAs) against self-supervised ECG foundation encoders (contrastive: SimCLR, TS2Vec; masked: CNN/Transformer MAE). It employs a subject-centric protocol with window-to-subject aggregation, calibration at fixed false-positive rates, and cross-dataset evaluation across three attacker interfaces: score-only black-box, adaptive subject-level aggregation, and embedding-geometry access. Results show heterogeneous, objective-dependent leakage—most pronounced in small/institution-specific cohorts, saturating in contrastive embeddings, and attenuated by larger/diverse datasets—leading to the claim that restricting raw signals or labels is insufficient to guarantee participation privacy in connected-health deployments.

Significance. If the reported leakage patterns hold under realistic conditions, the work provides timely, concrete evidence that participation privacy is a first-class concern for reusable biosignal foundation models. The multi-objective coverage, subject-centric calibration, and cross-dataset design are strengths that move beyond single-model, in-distribution attacks common in prior MIA literature. The findings directly support deployment-aware auditing recommendations for healthcare AI systems.

major comments (2)
  1. [Abstract and evaluation protocol] The central claim that 'restricting access to raw signals or labels is insufficient' rests on leakage observed under cross-dataset auditing. However, without a quantified analysis of distribution shift between pretraining and audit datasets (or ablations showing how shift magnitude modulates subject-level statistics), it is unclear whether the heterogeneous leakage (strongest in small cohorts, saturating in embeddings) would persist for a realistic adversary with only in-distribution or limited query access to a deployed encoder.
  2. [Attacker interfaces] The adaptive subject-level aggregation and embedding-geometry attackers assume repeated queries and latent access that may exceed typical model-as-a-service deployments (e.g., scalar-score APIs with rate limits). If these interfaces are not representative, the saturation result for contrastive encoders and the overall tail-risk conclusion require additional justification or experiments with more restricted query models.
minor comments (3)
  1. [Abstract] The abstract mentions 'implementation-grounded audit' but does not specify the exact pretraining datasets, model architectures, or hyperparameter ranges used for the SimCLR/TS2Vec/MAE encoders; adding these details would improve reproducibility.
  2. [Results] Table/figure captions should explicitly state the number of subjects, windows per subject, and FPR operating points used for calibration to allow direct comparison with prior MIA work on biosignals.
  3. [Methods] Notation for 'participation leakage' and 'operational tail risk' should be defined once in the methods before being used in the abstract and discussion.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive comments. We address each major point below, providing clarifications and indicating revisions to the manuscript where they strengthen the work without altering its core findings.

read point-by-point responses
  1. Referee: [Abstract and evaluation protocol] The central claim that 'restricting access to raw signals or labels is insufficient' rests on leakage observed under cross-dataset auditing. However, without quantified analysis of distribution shift between pretraining and audit datasets (or ablations showing how shift magnitude modulates subject-level statistics), it is unclear whether the heterogeneous leakage (strongest in small cohorts, saturating in embeddings) would persist for a realistic adversary with only in-distribution or limited query access to a deployed encoder.

    Authors: We agree that explicit quantification of distribution shift would add rigor. The cross-dataset protocol was chosen precisely because it models a realistic adversary who lacks access to the exact pretraining distribution, which is the typical case for deployed foundation encoders. To address the concern directly, we have added a new analysis subsection that reports maximum mean discrepancy (MMD) and Fréchet distance between pretraining and audit datasets, together with controlled ablations that interpolate shift magnitude. These results confirm that the reported patterns—particularly elevated leakage in small cohorts and saturation in contrastive embeddings—persist under moderate shifts and are attenuated only under very large shifts that are unlikely in practice. In-distribution attacks would be expected to increase leakage, so the cross-dataset results remain a conservative basis for the claim. revision: yes

  2. Referee: [Attacker interfaces] The adaptive subject-level aggregation and embedding-geometry attackers assume repeated queries and latent access that may exceed typical model-as-a-service deployments (e.g., scalar-score APIs with rate limits). If these interfaces are not representative, the saturation result for contrastive encoders and the overall tail-risk conclusion require additional justification or experiments with more restricted query models.

    Authors: We acknowledge that embedding access and unlimited queries are not universal. The manuscript already presents the score-only black-box interface as the most restricted baseline, and the saturation and tail-risk observations are visible even under that interface for contrastive models. To further address rate-limit concerns, we have added experiments that restrict adaptive attackers to 5–10 queries per subject (simulating realistic API constraints) and report that leakage remains statistically significant for small cohorts, although absolute AUC decreases. We have also expanded the discussion section to clarify the deployment scenarios each interface represents and to note that the strongest claims are supported by the score-only results. revision: partial
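The distribution-shift diagnostic cited in the first response (MMD between pretraining and audit datasets) has a standard form; a minimal sketch with an RBF kernel, assuming one feature vector per recording, is:

```python
# Biased MMD^2 estimate with RBF kernel k(a, b) = exp(-gamma * ||a - b||^2):
# near zero when x and y are drawn from the same distribution, larger
# under shift. The bandwidth `gamma` is a free choice in this sketch.
import numpy as np

def mmd2_rbf(x, y, gamma=1.0):
    """x: (n, d) and y: (m, d) samples; returns the biased MMD^2."""
    def k(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-gamma * d2)
    return float(k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean())
```

In the ablation the rebuttal describes, one would compute this between pretraining and audit feature samples and check how attack AUC varies with the measured shift.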

Circularity Check

0 steps flagged

Purely empirical MIA audit with no derivation chain

full rationale

The paper conducts an implementation-grounded empirical audit of membership inference attacks against self-supervised ECG encoders under three attacker interfaces and a cross-dataset protocol. No equations, first-principles derivations, or predictions appear that reduce by construction to quantities fitted from the same data or to self-citations. All claims rest on observed attack success rates, heterogeneity across objectives and cohort sizes, and calibration at fixed FPRs; these are direct experimental outputs rather than tautological renamings or fitted-input predictions. The study is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Empirical evaluation paper with no mathematical axioms, free parameters beyond standard attack thresholds, or invented entities.

pith-pipeline@v0.9.0 · 5600 in / 1047 out tokens · 49873 ms · 2026-05-10T16:23:22.591347+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

40 extracted references · 11 canonical work pages · 5 internal anchors

  1. [1]

    2025. Pulse-PPG: An Open-Source Field-Trained PPG Foundation Model for Wearable Applications Across Lab and Field Settings. arXiv preprint

  2. [2]

    Seyed Amir Hossein Aqajari, Ziyu Wang, Ali Tazarv, Sina Labbaf, Salar Jafarlou, Brenda Nguyen, Nikil Dutt, Marco Levorato, and Amir M Rahmani. 2024. Enhancing performance and user engagement in everyday stress monitoring: A context-aware active reinforcement learning approach. arXiv preprint arXiv:2407.08215 (2024)

  3. [3]

    Donald S Baim, Wilson S Colucci, E Scott Monrad, Harton S Smith, Richard F Wright, Alyce Lanoue, Diane F Gauthier, Bernard J Ransil, William Grossman, and Eugene Braunwald. 1986. Survival of patients with severe congestive heart failure treated with oral milrinone. Journal of the American College of Cardiology 7, 3 (1986), 661–670

  4. [4]

    Nicholas Carlini, Steve Chien, Milad Nasr, Shuang Song, Andreas Terzis, and Florian Tramèr. 2022. Membership Inference Attacks From First Principles. In 2022 IEEE Symposium on Security and Privacy (SP). IEEE, 1897–1914

  5. [5]

    Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. 2020. A Simple Framework for Contrastive Learning of Visual Representations. In International Conference on Machine Learning (ICML). PMLR, 1597–1607

  6. [6]

    Aniruddha Datta, Tamonash Bhattacharyya, Elahe Khatibi, Agasthya Seth, Ziyu Wang, Sanaz Rahimi Mousavi, Amir M Rahmani, Farshad Firouzi, and Krishnendu Chakrabarty. 2025. REACT: REinforcement Learning-Based Adaptive ECG Anonymization and Privacy Threat Mitigation. In 2025 IEEE International Conference on Omni-layer Intelligent Systems (COINS). IEEE, 1–8

  7. [7]

    Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, et al. 2024. The Llama 3 Herd of Models. https://arxiv.org/abs/2407.21783 arXiv:2407.21783

  8. [8]

    Zahra Ebrahimi, Mohammad Loni, Masoud Daneshtalab, and Arash Gharehbaghi. 2020. A review on deep learning methods for ECG arrhythmia classification. Expert Systems with Applications: X 7 (2020), 100033

  10. [10]

    Arin Ghazarian, Jianwei Zheng, Hesham El-Askary, Huimin Chu, Guohua Fu, and Cyril Rakovski. 2021. Increased risks of re-identification for patients posed by deep learning-based ECG identification algorithms. In 2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC). IEEE, 1969–1975

  11. [11]

    Ary L Goldberger, Luis AN Amaral, Leon Glass, Jeffrey M Hausdorff, Plamen Ch Ivanov, Roger G Mark, Joseph E Mietus, George B Moody, C-K Peng, and H Eugene Stanley. 2000. PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals. Circulation 101, 23 (2000), e215–e220

  12. [12]

    Ary L Goldberger, Luis AN Amaral, Leon Glass, Jeffrey M Hausdorff, Plamen Ch Ivanov, Roger G Mark, Joseph E Mietus, George B Moody, Chung-Kang Peng, and H Eugene Stanley. 2000. PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation 101, 23 (2000), e215–e220

  13. [13]

    Albert Gu, Karan Goel, and Christopher Ré. 2021. Efficiently modeling long sequences with structured state spaces. arXiv preprint arXiv:2111.00396 (2021)

  14. [15]

    Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, and Ross Girshick. 2022. Masked Autoencoders Are Scalable Vision Learners. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 16000–16009

  15. [16]

    Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, and Ross Girshick. 2022. Masked autoencoders are scalable vision learners. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 16000–16009

  17. [18]

    Dani Kiyasseh, Tingting Zhu, and David A Clifton. 2021. CLOCS: Contrastive learning of cardiac signals across space, time, and patients. In International Conference on Machine Learning. PMLR, 5606–5615

  18. [19]

    Ruggero Donida Labati, Edgar Muñoz, Vincenzo Piuri, Fabio Scotti, and Riccardo Sassi. 2019. Deep-ECG: Convolutional Neural Networks for ECG biometric recognition. Pattern Recognition Letters 126 (2019), 78–85

  19. [20]

    Paolo Melillo, Raffaele Izzo, Ada Orrico, Paolo Scala, Marcella Attanasio, Marco Mirra, Nicola De Luca, and Leandro Pecchia. 2015. Automatic prediction of cardiovascular and cerebrovascular events using heart rate variability analysis. PLoS ONE 10, 3 (2015), e0118504

  20. [21]

    George B Moody and Roger G Mark. 2001. The impact of the MIT-BIH arrhythmia database. IEEE Engineering in Medicine and Biology Magazine 20, 3 (2001), 45–50

  21. [22]

    Milad Nasr, Reza Shokri, and Amir Houmansadr. 2019. Comprehensive Privacy Analysis of Deep Learning: Passive and Active White-box Inference Attacks against Centralized and Federated Learning. In 2019 IEEE Symposium on Security and Privacy (SP). IEEE, 739–753

  22. [23]

    Andrea Nemcova, Radovan Smisek, Kamila Opravilová, Martin Vitek, Lukas Smital, and Lucie Maršánová. 2020. Brno University of Technology ECG Quality Database (BUT QDB). PhysioNet (July 2020). https://doi.org/10.13026/kah4-0w24 Version 1.0.0

  23. [24]

    Aaron van den Oord, Yazhe Li, and Oriol Vinyals. 2018. Representation learning with contrastive predictive coding. arXiv preprint arXiv:1807.03748 (2018)

  24. [25]

    OpenAI. 2023. GPT-4 Technical Report. https://arxiv.org/abs/2303.08774 arXiv:2303.08774

  25. [26]

    Arvind Pillai, Dimitris Spathis, Fahim Kawsar, and Mohammad Malekzadeh. 2025. PaPaGei: Open Foundation Models for Optical Physiological Signals. In International Conference on Learning Representations (ICLR). arXiv:2410.20542

  27. [28]

    Apple Machine Learning Research. 2024. Large-scale Training of Foundation Models for Wearable Biosignals. Apple Machine Learning Research. https://machinelearning.apple.com/research/large-scale-training

  28. [29]

    Reza Shokri, Marco Stronati, Congzheng Song, and Vitaly Shmatikov. 2017. Membership Inference Attacks against Machine Learning Models. In 2017 IEEE Symposium on Security and Privacy (SP). IEEE, 3–18

  29. [30]

    Mintu P Turakhia, Manisha Desai, Robert A Harrington, et al. 2019. Rationale and design of a large-scale, app-based study to identify cardiac arrhythmias using a smartwatch: The Apple Heart Study. American Heart Journal 207 (2019), 66–75

  30. [31]

    Patrick Wagner, Nils Strodthoff, Ralf-Dieter Bousseljot, Dominik Kreiseler, Florian I Lunze, Wojciech Samek, and Tobias Schaeffter. 2020. PTB-XL, a large publicly available electrocardiography dataset. Scientific Data 7, 1 (2020), 154

  31. [32]

    Ziyu Wang, Anil Kanduri, Seyed Amir Hossein Aqajari, Salar Jafarlou, Sanaz R Mousavi, Pasi Liljeberg, Shaista Malik, and Amir M Rahmani. 2024. ECG unveiled: Analysis of client re-identification risks in real-world ECG datasets. In 2024 IEEE 20th International Conference on Body Sensor Networks (BSN). IEEE, 1–4

  32. [33]

    Ziyu Wang, Elahe Khatibi, Farshad Firouzi, Sanaz Rahimi Mousavi, Krishnendu Chakrabarty, and Amir M Rahmani. 2025. Linkage Attacks Expose Identity Risks in Public ECG Data Sharing. arXiv preprint arXiv:2508.15850 (2025)

  33. [34]

    Ziyu Wang, Elahe Khatibi, Kianoosh Kazemi, Iman Azimi, Sanaz Mousavi, Shaista Malik, and Amir M Rahmani. 2025. TransECG: Leveraging Transformers for Explainable ECG Re-identification Risk Analysis. arXiv preprint arXiv:2503.13495 (2025)

  34. [35]

    Ziyu Wang, Elahe Khatibi, and Amir M Rahmani. 2025. MedCoT-RAG: Causal Chain-of-Thought RAG for Medical Question Answering. arXiv preprint arXiv:2508.15849 (2025)

  35. [36]

    Ziyu Wang, Hao Li, Di Huang, Hye-Sung Kim, Chae-Won Shin, and Amir M Rahmani. 2025. HealthQ: Unveiling questioning capabilities of LLM chains in healthcare conversations. Smart Health (2025), 100570

  36. [37]

    Ziyu Wang, Nanqing Luo, and Pan Zhou. 2020. GuardHealth: Blockchain empowered secure data management and Graph Convolutional Network enabled anomaly detection in smart healthcare. J. Parallel and Distrib. Comput. 142 (2020), 1–12

  37. [38]

    Ziyu Wang, Zhongqi Yang, Iman Azimi, and Amir M Rahmani. 2024. Differential private federated transfer learning for mental health monitoring in everyday settings: A case study on stress detection. In 2024 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). IEEE, 1–5

  38. [39]

    An Yang, Baosong Yang, Binyuan Hui, et al. 2024. Qwen2 Technical Report. https://arxiv.org/abs/2407.10671 arXiv:2407.10671

  39. [40]

    Samuel Yeom, Irene Giacomelli, Matt Fredrikson, and Somesh Jha. 2018. Privacy Risk in Machine Learning: Analyzing the Connection to Overfitting. In 2018 IEEE 31st Computer Security Foundations Symposium (CSF). IEEE, 268–282

  40. [41]

    Zhihan Yue, Yujing Wang, Juanyong Duan, Tianmeng Yang, Congrui Huang, Yunhai Tong, and Bixiong Xu. 2022. TS2Vec: Towards universal representation of time series. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 36. 8980–8987