State-of-the-art deep neural network classifiers are highly vulnerable to adversarial examples, which are designed to mislead classifiers with very small perturbations. However, the performance of black-box attacks (mounted without knowledge of the model parameters) against deployed models degrades significantly. In this paper, we propose a novel way of crafting adversarial perturbations that transfer in the black-box setting. We first show that maximizing the distance between natural images and their adversarial examples in intermediate feature maps improves both white-box attacks (mounted with knowledge of the model parameters) and black-box attacks. We also show that smoothness regularization on the adversarial perturbations enables them to transfer across models. Extensive experiments show that our approach outperforms state-of-the-art methods in both white-box and black-box attacks.
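The abstract's two ingredients can be made concrete in a few lines. Below is a minimal PyTorch sketch, not the paper's implementation: it maximizes the distance between clean and adversarial activations at one intermediate layer of a surrogate network, and uses a total-variation penalty as a stand-in for the smoothness regularizer. The surrogate model, the hooked layer (`layer3` of a ResNet-18), the step sizes, and the function `feature_space_attack` are all illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F
import torchvision.models as models

# Surrogate model whose intermediate features guide the attack.
model = models.resnet18(weights=None).eval()

# Capture the output of one intermediate block via a forward hook;
# which layer(s) to use is a design choice, assumed here for illustration.
features = {}
model.layer3.register_forward_hook(lambda m, i, o: features.update(feat=o))

def total_variation(delta):
    """One common smoothness prior: penalize differences between neighboring pixels."""
    dh = (delta[:, :, 1:, :] - delta[:, :, :-1, :]).abs().mean()
    dw = (delta[:, :, :, 1:] - delta[:, :, :, :-1]).abs().mean()
    return dh + dw

def feature_space_attack(x, eps=8/255, alpha=2/255, steps=10, lam=0.1):
    # Features of the clean image, fixed as the reference point.
    with torch.no_grad():
        model(x)
        clean_feat = features["feat"].detach()

    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        model(x + delta)
        # Minimizing (-feature distance + smoothness penalty) pushes the
        # adversarial features away from the clean ones while keeping the
        # perturbation spatially smooth.
        loss = -F.mse_loss(features["feat"], clean_feat) + lam * total_variation(delta)
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()
            delta.clamp_(-eps, eps)                   # stay in the L-inf ball
            delta.copy_((x + delta).clamp(0, 1) - x)  # keep x + delta a valid image
        delta.grad.zero_()
    return (x + delta).detach()

x = torch.rand(1, 3, 224, 224)   # stand-in for a preprocessed natural image
x_adv = feature_space_attack(x)
```

Descending on the negative feature distance is gradient ascent on the distance itself; the smoothness term trades a little single-model attack strength for perturbations that transfer better across models, which is the abstract's central claim.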
Zhou, W., Hou, X., Chen, Y., Tang, M., Huang, X., Gan, X., & Yang, Y. (2018). Transferable adversarial perturbations. In Lecture Notes in Computer Science (Vol. 11218, pp. 471–486). Springer. https://doi.org/10.1007/978-3-030-01264-9_28