Stealing Machine Learning Models: Attacks and Countermeasures for Generative Adversarial Networks

Citations: 34
Mendeley readers: 32
Abstract

Model extraction attacks aim to duplicate a machine learning model through query access to a target model. Early studies mainly focused on discriminative models; despite their success, model extraction attacks against generative models remain far less explored. In this paper, we systematically study the feasibility of model extraction attacks against generative adversarial networks (GANs). Specifically, we first define fidelity and accuracy for model extraction attacks against GANs. We then study these attacks from the perspectives of fidelity extraction and accuracy extraction, according to the adversary's goals and background knowledge. We further conduct a case study in which the adversary transfers knowledge of the extracted model, which steals a state-of-the-art GAN trained on more than 3 million images, to new domains, broadening the scope of applications of model extraction attacks. Finally, we propose effective defense techniques to safeguard GANs, considering the trade-off between the utility and security of GAN models.
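The accuracy-extraction setting described above can be illustrated with a minimal, purely conceptual sketch: the adversary draws samples from a black-box target generator and fits a substitute generative model on the stolen samples. All names here are illustrative assumptions (the paper's targets are real GANs such as one trained on millions of images; a one-dimensional Gaussian fitted by moment matching stands in for training a substitute GAN).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical black-box target: the attacker can only request samples,
# standing in for query access to a trained GAN's generator.
def target_generator(n):
    # Unknown to the attacker: samples from N(3.0, 0.5^2).
    return rng.normal(3.0, 0.5, size=n)

def accuracy_extraction(query_budget):
    """Accuracy extraction (sketch): query the target for samples, then
    fit a substitute model on them. Here the 'substitute GAN' is a
    Gaussian fitted by maximum likelihood over the stolen samples."""
    stolen = target_generator(query_budget)
    mu, sigma = stolen.mean(), stolen.std()

    def substitute_generator(n):
        return rng.normal(mu, sigma, size=n)

    return substitute_generator, (mu, sigma)

substitute, (mu, sigma) = accuracy_extraction(10_000)
print(round(mu, 1), round(sigma, 1))  # close to the target's 3.0 and 0.5
```

With a larger query budget, the substitute's distribution approaches the target's, which is exactly the trade-off the paper's defenses (e.g., limiting or perturbing what queries reveal) aim to manage.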

Citation (APA)

Hu, H., & Pang, J. (2021). Stealing Machine Learning Models: Attacks and Countermeasures for Generative Adversarial Networks. In ACM International Conference Proceeding Series (pp. 1–16). Association for Computing Machinery. https://doi.org/10.1145/3485832.3485838
