Deep learning architectures are vulnerable to adversarial perturbations: small changes added to the input that drastically alter the output of deep networks. The perturbed inputs are called adversarial examples, and they have been observed in learning tasks ranging from supervised learning to unsupervised and reinforcement learning. In this chapter, we review some of the most important highlights in the theory and practice of adversarial examples, focusing on the design of adversarial attacks, theoretical investigations into the nature of adversarial examples, and defenses against adversarial attacks. A common thread in the design of adversarial attacks is the perturbation analysis of learning algorithms; many existing algorithms rely implicitly on perturbation analysis to generate adversarial examples, and the most powerful attacks are summarized in this light. We also survey theories on the existence of adversarial examples, as well as theories relating the generalization error to adversarial robustness. Finally, various defenses against adversarial examples are discussed.
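The idea that adversarial attacks follow from perturbation analysis can be sketched with the fast gradient sign method (FGSM), a well-known first-order attack. The snippet below is a minimal illustration, not the chapter's method: it uses a toy linear classifier with binary cross-entropy loss, and all names (`fgsm`, `loss`, the random data) are illustrative assumptions. The input is moved by a small amount `eps` in the direction of the sign of the loss gradient, which maximizes a first-order approximation of the loss under an L-infinity budget.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(x, w, y):
    # Binary cross-entropy of a linear classifier p = sigmoid(w . x).
    p = sigmoid(w @ x)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def fgsm(x, w, y, eps):
    # For a linear model the gradient of the loss w.r.t. the input
    # has the closed form dL/dx = (sigmoid(w . x) - y) * w.
    grad = (sigmoid(w @ x) - y) * w
    # Step in the direction that increases the loss the most
    # under an L-infinity perturbation budget of eps.
    return x + eps * np.sign(grad)

rng = np.random.default_rng(0)
w = rng.normal(size=5)   # fixed "trained" weights (toy example)
x = rng.normal(size=5)   # a clean input
y = 1.0                  # its true label

x_adv = fgsm(x, w, y, eps=0.1)
# The perturbation stays within the eps budget yet increases the loss.
```

For deep networks the closed-form gradient is replaced by backpropagation through the model, but the structure of the attack, a perturbation bounded in some norm that locally maximizes the loss, is the same.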
Balda, E. R., Behboodi, A., & Mathar, R. (2020). Adversarial examples in deep neural networks: An overview. In Studies in Computational Intelligence (Vol. 865, pp. 31–65). Springer. https://doi.org/10.1007/978-3-030-31760-7_2