Membership Inference Attacks against Machine Learning Models

Reza Shokri , Marco Stronati , Congzheng Song , Vitaly Shmatikov

Authors on Pith no claims yet

classification 💻 cs.CR cs.LGstat.ML

keywords inferencemembershipmodellearningmachinemodelstrainedattacks

read the original abstract

We quantitatively investigate how machine learning models leak information about the individual data records on which they were trained. We focus on the basic membership inference attack: given a data record and black-box access to a model, determine if the record was in the model's training dataset. To perform membership inference against a target model, we make adversarial use of machine learning and train our own inference model to recognize differences in the target model's predictions on the inputs that it trained on versus the inputs that it did not train on. We empirically evaluate our inference techniques on classification models trained by commercial "machine learning as a service" providers such as Google and Amazon. Using realistic datasets and classification tasks, including a hospital discharge dataset whose membership is sensitive from the privacy perspective, we show that these models can be vulnerable to membership inference attacks. We then investigate the factors that influence this leakage and evaluate mitigation strategies.

This paper has not been read by Pith yet.

discussion (0)

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Machine Unlearning for Class Removal through SISA-based Deep Neural Network Architectures
cs.CV 2026-04 unverdicted novelty 5.0

A modified SISA architecture with replay and gating achieves effective class removal from trained CNNs on image datasets while preserving accuracy and cutting retraining costs.
SoK: A Comprehensive Analysis of the Current Status of Neural Tangent Generalization Attacks with Research Directions
cs.LG 2026-05 accept novelty 3.0

NTGA is the first clean-label generalization attack under black-box settings but is vulnerable to adversarial training and image transformations, with newer attacks outperforming it.