Membership Inference Attacks and Defenses in Classification Models

Abstract

We study membership inference (MI) attacks against classifiers, in which the attacker's goal is to determine whether a data instance was used to train the classifier. Through a systematic cataloging of existing MI attacks and extensive experimental evaluation of them, we find that a model's vulnerability to MI attacks is tightly related to its generalization gap: the difference between training accuracy and test accuracy. We then propose a defense against MI attacks that aims to close this gap by intentionally reducing the training accuracy. More specifically, the training process attempts to match the training and validation accuracies by means of a new set regularizer that uses the Maximum Mean Discrepancy (MMD) between the empirical distributions of the softmax outputs on the training and validation sets. Our experimental results show that combining this approach with another simple defense (mix-up training) significantly improves on the state of the art in defending against MI attacks, with minimal impact on test accuracy.
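
To make the core idea concrete, below is a minimal sketch of an MMD-based set regularizer in PyTorch. The Gaussian RBF kernel, the bandwidth sigma, the penalty weight lam, and all function names here are illustrative assumptions, not the paper's exact formulation.

    # Sketch: MMD penalty between softmax outputs of training and validation
    # batches, added to the usual cross-entropy loss. Kernel choice and all
    # hyperparameters (sigma, lam) are assumptions made for illustration.
    import torch
    import torch.nn.functional as F

    def gaussian_kernel(x, y, sigma=1.0):
        # Pairwise Gaussian RBF kernel between rows of x and rows of y.
        sq_dists = torch.cdist(x, y) ** 2
        return torch.exp(-sq_dists / (2 * sigma ** 2))

    def mmd_squared(p_train, p_valid, sigma=1.0):
        # Empirical (biased) MMD^2 between two samples of softmax outputs.
        k_tt = gaussian_kernel(p_train, p_train, sigma).mean()
        k_vv = gaussian_kernel(p_valid, p_valid, sigma).mean()
        k_tv = gaussian_kernel(p_train, p_valid, sigma).mean()
        return k_tt + k_vv - 2.0 * k_tv

    def regularized_loss(model, x_train, y_train, x_valid, lam=1.0):
        # Cross-entropy on the training batch, plus an MMD penalty that pulls
        # the softmax output distributions of the training and validation
        # batches toward each other, shrinking the generalization gap.
        logits_train = model(x_train)
        ce = F.cross_entropy(logits_train, y_train)
        p_train = F.softmax(logits_train, dim=1)
        p_valid = F.softmax(model(x_valid), dim=1)  # validation labels unused
        return ce + lam * mmd_squared(p_train, p_valid)

A full implementation would tune sigma and lam, and might compare the two sets conditioned on class label; this sketch treats each batch as a single unconditioned set.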

Citation (APA)

Li, J., Li, N., & Ribeiro, B. (2021). Membership Inference Attacks and Defenses in Classification Models. In CODASPY 2021 - Proceedings of the 11th ACM Conference on Data and Application Security and Privacy (pp. 5–16). Association for Computing Machinery, Inc. https://doi.org/10.1145/3422337.3447836
