Stealing Machine Learning Models: Attacks and Countermeasures for Generative Adversarial Networks

Citations: 34
Mendeley readers: 32
Abstract

Model extraction attacks aim to duplicate a machine learning model through query access to a target model. Early studies mainly focused on discriminative models; despite their success, model extraction attacks against generative models remain far less explored. In this paper, we systematically study the feasibility of model extraction attacks against generative adversarial networks (GANs). Specifically, we first define fidelity and accuracy for model extraction attacks against GANs. We then study these attacks from the perspectives of fidelity extraction and accuracy extraction, according to the adversary's goals and background knowledge. We further conduct a case study in which the adversary transfers knowledge of the extracted model, which steals a state-of-the-art GAN trained on more than 3 million images, to new domains, broadening the scope of applications of model extraction attacks. Finally, we propose effective defense techniques to safeguard GANs, considering the trade-off between the utility and security of GAN models.
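The accuracy-extraction setting described above can be illustrated with a minimal, purely conceptual sketch: the adversary draws samples from a black-box target generator and fits a substitute generative model on the stolen samples. All names here are illustrative assumptions (the paper's targets are real GANs such as one trained on millions of images; a one-dimensional Gaussian fitted by moment matching stands in for training a substitute GAN).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical black-box target: the attacker can only request samples,
# standing in for query access to a trained GAN's generator.
def target_generator(n):
    # Unknown to the attacker: samples from N(3.0, 0.5^2).
    return rng.normal(3.0, 0.5, size=n)

def accuracy_extraction(query_budget):
    """Accuracy extraction (sketch): query the target for samples, then
    fit a substitute model on them. Here the 'substitute GAN' is a
    Gaussian fitted by maximum likelihood over the stolen samples."""
    stolen = target_generator(query_budget)
    mu, sigma = stolen.mean(), stolen.std()

    def substitute_generator(n):
        return rng.normal(mu, sigma, size=n)

    return substitute_generator, (mu, sigma)

substitute, (mu, sigma) = accuracy_extraction(10_000)
print(round(mu, 1), round(sigma, 1))  # close to the target's 3.0 and 0.5
```

With a larger query budget, the substitute's distribution approaches the target's, which is exactly the trade-off the paper's defenses (e.g., limiting or perturbing what queries reveal) aim to manage.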

Citation (APA)

Hu, H., & Pang, J. (2021). Stealing Machine Learning Models: Attacks and Countermeasures for Generative Adversarial Networks. In ACM International Conference Proceeding Series (pp. 1–16). Association for Computing Machinery. https://doi.org/10.1145/3485832.3485838
