Countermeasure against Backdoor Attack on Neural Networks Utilizing Knowledge Distillation

  • Yoshida K
  • Fujino T

Abstract

A backdoor attack is a well-known security issue for deep neural networks (DNNs). In a backdoor attack against DNNs for image classification, an adversary creates tampered data containing special marks ("poison data") and injects them into the training dataset. A DNN model trained on the tampered dataset achieves high classification accuracy on clean (normal) input data, but inference on poisoned input data is misclassified as the adversarial target label. In this work, we propose a countermeasure against the backdoor attack that utilizes knowledge distillation, in which the DNN model user distills a backdoored DNN model with clean unlabeled data. The distilled DNN model is trained with only the clean knowledge of the backdoored model, because the backdoor is not activated by clean data. Experimental results showed that the distilled model achieves high performance, equivalent to that of a clean model without a backdoor.
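
The mechanism described in the abstract is simple enough to sketch in code. The following is a minimal, hypothetical PyTorch sketch of one distillation step, not an implementation taken from the paper: a (possibly backdoored) teacher model produces soft outputs on clean unlabeled data, and a fresh student model is trained to match them, so the trigger pattern is never exercised during training. The function name, the temperature value, and the use of a KL-divergence loss are illustrative assumptions.

    import torch
    import torch.nn.functional as F

    def distill_step(teacher, student, optimizer, clean_inputs, temperature=4.0):
        """One distillation step on a batch of clean, unlabeled inputs.

        The student mimics the teacher's soft output distribution; since no
        poisoned inputs are ever shown, the backdoor mapping is not transferred.
        """
        teacher.eval()
        with torch.no_grad():
            # Soft targets from the (possibly backdoored) teacher on clean data
            soft_targets = F.softmax(teacher(clean_inputs) / temperature, dim=1)

        student.train()
        optimizer.zero_grad()
        student_log_probs = F.log_softmax(student(clean_inputs) / temperature, dim=1)
        # KL divergence between student and teacher distributions; the T^2
        # factor keeps gradient magnitudes comparable across temperatures
        loss = F.kl_div(student_log_probs, soft_targets,
                        reduction="batchmean") * temperature ** 2
        loss.backward()
        optimizer.step()
        return loss.item()

Repeating this step over a clean unlabeled dataset yields a student that inherits the teacher's clean-data behavior; because the trigger never appears in the inputs, the backdoor behavior is never distilled into the student.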

Citation (APA)

Yoshida, K., & Fujino, T. (2020). Countermeasure against Backdoor Attack on Neural Networks Utilizing Knowledge Distillation. Journal of Signal Processing, 24(4), 141–144. https://doi.org/10.2299/jsp.24.141
