Detecting adversarial examples through image transformation

Abstract

Deep Neural Networks (DNNs) have demonstrated remarkable performance in a diverse range of applications. Along with the prevalence of deep learning, it has been revealed that DNNs are vulnerable to attacks. By deliberately crafting adversarial examples, an adversary can manipulate a DNN into producing incorrect outputs, which may lead to catastrophic consequences in applications such as disease diagnosis and self-driving cars. In this paper, we propose an effective method to detect adversarial examples in image classification. Our key insight is that adversarial examples are usually sensitive to certain image transformation operations such as rotation and shifting, whereas a normal image is generally immune to such operations. We implement this idea of image transformation and evaluate its performance under oblivious attacks. Our experiments with two datasets show that our technique can detect nearly 99% of adversarial examples generated by the state-of-the-art algorithm. Beyond oblivious attacks, we also consider the case of white-box attacks, for which we propose to introduce randomness into the image transformation process, achieving a detection ratio of around 70%.
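The following is a minimal sketch of the transformation-consistency idea described in the abstract: apply small rotations and shifts to the input and flag it as adversarial if the classifier's prediction is unstable across the transformed copies. The classifier interface `predict_label`, the transformation parameters, and the voting threshold are illustrative assumptions, not the authors' exact configuration.

```python
import numpy as np
from scipy.ndimage import rotate, shift


def transformed_copies(image, rng=None):
    """Generate rotated and shifted variants of an (H, W, C) image.

    If an RNG is supplied, the transformation parameters are randomized,
    loosely mirroring the randomized defense the abstract proposes for
    white-box attacks.
    """
    if rng is None:
        angles = [-10, -5, 5, 10]                     # degrees, fixed grid (assumed values)
        offsets = [(-2, 0), (2, 0), (0, -2), (0, 2)]  # pixel shifts (assumed values)
    else:
        angles = rng.uniform(-15, 15, size=4)
        offsets = rng.integers(-3, 4, size=(4, 2))

    variants = [rotate(image, a, axes=(0, 1), reshape=False, mode="nearest")
                for a in angles]
    variants += [shift(image, (dy, dx, 0), mode="nearest")
                 for dy, dx in offsets]
    return variants


def looks_adversarial(image, predict_label, agreement_threshold=0.75, rng=None):
    """Flag an image as adversarial if too few transformed copies keep the
    original predicted label. `predict_label` maps an image array to a class
    index; both the interface and the threshold are assumptions for this sketch.
    """
    original = predict_label(image)
    variants = transformed_copies(image, rng=rng)
    agree = sum(predict_label(v) == original for v in variants)
    return agree / len(variants) < agreement_threshold
```

In this sketch, passing a `numpy.random.Generator` as `rng` randomizes the rotation angles and shift offsets on every call, which is one plausible way to realize the randomness the abstract introduces against white-box attackers who know the transformation scheme.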

Citation (APA)
Tian, S., Yang, G., & Cai, Y. (2018). Detecting adversarial examples through image transformation. In 32nd AAAI Conference on Artificial Intelligence, AAAI 2018 (pp. 4139–4146). AAAI press. https://doi.org/10.1609/aaai.v32i1.11828
