Deep learning models are vulnerable to adversarial examples. Most current adversarial attacks add pixel-wise perturbations bounded by some Lp norm, and defense models are likewise evaluated on adversarial examples restricted to Lp-norm balls. However, adversarial examples also exist beyond Lp-norm balls, and we wish to explore them and their implications for attacks and defenses. In this paper, we focus on adversarial images generated by transformations. We start with color transformations and propose two gradient-based attacks. Since Lp norms are inappropriate for measuring image quality in the transformation space, we instead use the similarity between transformations and the Structural Similarity Index (SSIM). Next, we explore a larger transformation space consisting of combinations of color and affine transformations. We evaluate our transformation attacks on three data sets (CIFAR10, SVHN, and ImageNet) and their corresponding models. Finally, we perform retraining defenses to evaluate the strength of our attacks. The results show that transformation attacks are powerful: they find high-quality adversarial images with higher transferability and misclassification rates than C&W's Lp attacks, especially at high confidence levels, and they are significantly harder to defend against by retraining than C&W's Lp attacks. More importantly, exploring different attack spaces makes it more challenging to train a universally robust model.
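To illustrate the general idea behind a gradient-based transformation attack, the following is a minimal sketch, not the paper's exact algorithm: the per-channel scale-and-shift color parameterization, optimizer, step count, and learning rate are assumptions. Instead of perturbing individual pixels under an Lp budget, it optimizes the parameters of a differentiable color transformation so that the transformed image is misclassified.

```python
import torch
import torch.nn.functional as F

def color_transform_attack(model, image, label, steps=100, lr=0.01):
    """Untargeted attack in a simple color-transformation space (illustrative only).

    image: (1, 3, H, W) tensor with values in [0, 1]
    label: (1,) long tensor holding the true class index
    """
    # Parameters of a per-channel affine color transform: scale and shift.
    scale = torch.ones(1, 3, 1, 1, requires_grad=True)
    shift = torch.zeros(1, 3, 1, 1, requires_grad=True)
    opt = torch.optim.Adam([scale, shift], lr=lr)
    for _ in range(steps):
        # Apply the color transformation and keep pixel values valid.
        adv = torch.clamp(image * scale + shift, 0.0, 1.0)
        # Maximize the true-class loss, i.e. push the prediction away from it.
        loss = -F.cross_entropy(model(adv), label)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.clamp(image * scale + shift, 0.0, 1.0).detach()
```

The perceptual quality of the resulting image could then be checked with an SSIM implementation such as skimage.metrics.structural_similarity, in the spirit of the SSIM measure discussed in the abstract.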
Chen, J., Wang, D., & Chen, H. (2020). Explore the Transformation Space for Adversarial Images. In CODASPY 2020 - Proceedings of the 10th ACM Conference on Data and Application Security and Privacy (pp. 109–120). Association for Computing Machinery, Inc. https://doi.org/10.1145/3374664.3375728