Generating Adversarial Texts for Recurrent Neural Networks

Abstract

Adversarial examples have received increasing attention recently due to their significant value in evaluating and improving the robustness of deep neural networks. Existing adversarial attack algorithms have achieved good results on most images. However, those algorithms cannot be applied directly to text, as text data is discrete in nature. In this paper, we extend two state-of-the-art attack algorithms, PGD and C&W, to craft adversarial text examples for RNN-based models. The Extend-PGD attack identifies the words that are most important for classification by computing the Jacobian matrix of the classifier, which allows it to generate adversarial text examples efficiently. The Extend-C&W attack uses regularization to minimize the alteration of the original input text. We conduct comparison experiments on two recurrent neural networks trained to classify texts from two real-world datasets. Experimental results show that Extend-PGD achieves a higher attack success rate, while Extend-C&W better preserves the semantics of the original text.
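
The paper's own implementation is not reproduced here; as a rough sketch of the Jacobian-based word ranking described in the abstract, the following PyTorch code scores each word by the gradient of the target-class logit with respect to that word's embedding, i.e. one row of the classifier's Jacobian. All names (TextRNN, word_saliency, the toy dimensions) are illustrative assumptions, not the authors' code. The Extend-C&W side would additionally add a regularization term to the attack objective, for example an L2 penalty on the embedding perturbation, to keep the adversarial text close to the original; that step is omitted in this sketch.

# Hypothetical sketch: Jacobian-based word saliency for an RNN text
# classifier, in the spirit of the Extend-PGD step described above.
import torch
import torch.nn as nn

class TextRNN(nn.Module):
    """Toy LSTM classifier: embeddings -> LSTM -> linear logits."""
    def __init__(self, vocab_size=1000, embed_dim=64, hidden=128, classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, classes)

    def forward_from_embeddings(self, emb):
        out, _ = self.lstm(emb)
        return self.fc(out[:, -1])  # logits from the final time step

def word_saliency(model, token_ids, target_class):
    """Rank words by the gradient of the target-class logit with
    respect to each word's embedding (one row of the Jacobian)."""
    emb = model.embed(token_ids).detach().requires_grad_(True)
    logits = model.forward_from_embeddings(emb)
    logits[0, target_class].backward()
    # L2 norm of the gradient per word approximates its importance
    return emb.grad.norm(dim=-1).squeeze(0)

model = TextRNN()
tokens = torch.randint(0, 1000, (1, 12))   # one 12-word example
scores = word_saliency(model, tokens, target_class=1)
print(scores.argsort(descending=True))     # most influential words first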

Citation (APA)

Liu, C., Lin, W., & Yang, Z. (2020). Generating Adversarial Texts for Recurrent Neural Networks. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12396 LNCS, pp. 39–51). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-61609-0_4
