Detection by attack: Detecting adversarial samples by undercover attack

Abstract

The safety of artificial intelligence systems has raised great concern due to the vulnerability of deep neural networks: studies show that malicious modifications to the inputs of a network classifier can fool the classifier into wrong predictions. These modified inputs are called adversarial samples. To address this challenge, this paper proposes a novel and effective framework, Detection by Attack (DBA), which detects adversarial samples via an undercover attack. DBA converts the difficult adversarial-detection problem into a simpler attack problem, a strategy inspired by espionage techniques: the undercover attack appears to assault the system while actually defending it. To the best of our knowledge, this paper is the first to introduce a detection method that effectively detects adversarial samples in both images and text. Experimental results show that DBA yields state-of-the-art detection performance in both detector-unaware (95.66% detection accuracy on average) and detector-aware (2.10% attack success rate) scenarios. Furthermore, DBA is robust to the perturbation size and confidence of adversarial samples. The code is available at https://github.com/Mrzhouqifei/DBA.
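
As a rough illustration of the detection-by-attack idea described in the abstract, the sketch below applies a generic FGSM-style perturbation as a stand-in for the paper's undercover attack (whose exact form is not given here) and flags an input as adversarial when the model's output distribution shifts by more than a threshold. The model, the attack budget `epsilon`, and the threshold `tau` are all illustrative assumptions, not the authors' actual configuration.

```python
import torch
import torch.nn.functional as F

def undercover_attack(model, x, epsilon=0.03):
    """One FGSM-style gradient-sign step against the model's own prediction.

    A generic stand-in for the paper's undercover attack; the budget
    epsilon is an assumed value, not taken from the paper.
    """
    x = x.clone().detach().requires_grad_(True)
    logits = model(x)
    # Attack the label the model currently predicts, so no true label is needed.
    loss = F.cross_entropy(logits, logits.argmax(dim=1))
    loss.backward()
    return (x + epsilon * x.grad.sign()).detach()

def detect_by_attack(model, x, tau=0.5):
    """Flag inputs whose output distribution shifts sharply under the attack.

    tau is a hypothetical KL-divergence threshold that would be tuned on
    validation data in practice.
    """
    with torch.no_grad():
        logp_clean = F.log_softmax(model(x), dim=1)
    x_attacked = undercover_attack(model, x)
    with torch.no_grad():
        logp_attacked = F.log_softmax(model(x_attacked), dim=1)
    # Per-sample KL(clean || attacked); a large shift suggests the input
    # was already adversarial (i.e., near a decision boundary).
    kl = (logp_clean.exp() * (logp_clean - logp_attacked)).sum(dim=1)
    return kl > tau  # True -> suspected adversarial sample
```

The underlying intuition is that adversarial samples sit close to a decision boundary, so a small second attack moves their predictions much further than it moves those of benign inputs; the divergence between pre- and post-attack outputs then serves as the detection statistic.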

Citation (APA)

Zhou, Q., Zhang, R., Wu, B., Li, W., & Mo, T. (2020). Detection by attack: Detecting adversarial samples by undercover attack. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12309 LNCS, pp. 146–164). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-59013-0_8
