NIC: Detecting Adversarial Samples with Neural Network Invariant Checking

Abstract

Deep neural networks (DNNs) are vulnerable to adversarial samples, which are generated by perturbing correctly classified inputs to cause DNN models to misbehave (e.g., misclassify). This can lead to disastrous consequences, especially in security-sensitive applications. Existing defense and detection techniques work well for specific attacks under various assumptions (e.g., that the set of possible attacks is known beforehand). However, they are not sufficiently general to protect against a broader range of attacks. In this paper, we analyze the internals of DNN models under various attacks and identify two common exploitation channels: the provenance channel and the activation value distribution channel. We then propose a novel technique to extract DNN invariants and use them to perform runtime adversarial sample detection. Our experimental results on 11 different kinds of attacks, 13 models, and popular datasets including ImageNet show that our technique can effectively detect all these attacks (over 90% accuracy) with limited false positives. We also compare it with three state-of-the-art techniques: the Local Intrinsic Dimensionality (LID) based method, denoiser based methods (i.e., MagNet and HGD), and the prediction inconsistency based approach (i.e., feature squeezing). Our experiments show promising results.
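To make the activation value distribution channel concrete, the sketch below (an illustrative assumption, not the authors' exact implementation) fits one one-class model per layer on the activations of benign inputs and flags a sample at runtime if any layer's activations fall outside the learned distribution. The layer names, synthetic activations, and model hyperparameters are all hypothetical.

```python
# Hedged sketch of per-layer "value invariants": one-class SVMs fit on
# benign activations, checked at runtime. Not the paper's exact method.
import numpy as np
from sklearn.svm import OneClassSVM

def fit_layer_invariants(benign_activations):
    """Fit one one-class model per layer on activations of benign inputs.

    benign_activations: dict mapping layer name -> array of shape (n, d).
    """
    return {
        layer: OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(acts)
        for layer, acts in benign_activations.items()
    }

def is_adversarial(models, sample_activations):
    """Flag a sample if any layer's activations violate that layer's invariant."""
    for layer, model in models.items():
        # OneClassSVM.predict returns -1 for outliers, +1 for inliers.
        if model.predict(sample_activations[layer].reshape(1, -1))[0] == -1:
            return True
    return False

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for per-layer activations of benign training inputs.
    benign = {
        "conv1": rng.normal(0, 1, (500, 64)),
        "fc1": rng.normal(0, 1, (500, 128)),
    }
    models = fit_layer_invariants(benign)

    benign_sample = {"conv1": rng.normal(0, 1, 64), "fc1": rng.normal(0, 1, 128)}
    shifted_sample = {"conv1": rng.normal(5, 1, 64), "fc1": rng.normal(5, 1, 128)}
    print(is_adversarial(models, benign_sample))   # likely False
    print(is_adversarial(models, shifted_sample))  # likely True
```

In practice, the activations would come from the deployed model's hidden layers rather than synthetic data; the one-class SVM here simply stands in for whatever distribution model is used to encode each layer's invariant.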

Citation

Ma, S., Liu, Y., Tao, G., Lee, W. C., & Zhang, X. (2019). NIC: Detecting Adversarial Samples with Neural Network Invariant Checking. In 26th Annual Network and Distributed System Security Symposium, NDSS 2019. The Internet Society. https://doi.org/10.14722/ndss.2019.23415
