Generating fluent adversarial examples for natural languages


Abstract

Efficiently building an adversarial attacker for natural language processing (NLP) tasks is a real challenge. First, because the sentence space is discrete, it is difficult to make small perturbations along the direction of gradients. Second, the fluency of the generated examples cannot be guaranteed. In this paper, we propose MHA, which addresses both problems by performing Metropolis-Hastings sampling with a proposal distribution designed under the guidance of gradients. Experiments on IMDB and SNLI show that the proposed MHA outperforms the baseline model in attack capability. Adversarial training with MHA also leads to better robustness and performance.
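
To make the sampling idea concrete, the sketch below shows a generic word-replacement Metropolis-Hastings loop in Python. It is a minimal illustration under stated assumptions, not the paper's MHA implementation: the stationary distribution is assumed to combine a language-model fluency score with the victim classifier's error probability, and lm_fluency, attack_score, propose_replacement, and the candidate word list are hypothetical placeholders. The paper's proposal additionally uses the victim model's gradients to rank candidate replacements, which is omitted here.

# Illustrative sketch only: a generic word-replacement Metropolis-Hastings loop.
# The scoring functions and candidate set are simplified stand-ins, not the
# authors' MHA implementation.
import random

def lm_fluency(sentence):
    """Stand-in for a language-model fluency score (higher = more fluent)."""
    return 1.0  # placeholder; a real attacker would query a language model

def attack_score(sentence, victim_label):
    """Stand-in for 1 - P(victim_label | sentence) under the victim classifier."""
    return 0.5  # placeholder; a real attacker would query the victim model

def target_density(sentence, victim_label):
    # Unnormalized stationary distribution: fluent AND adversarial sentences score high.
    return lm_fluency(sentence) * attack_score(sentence, victim_label)

def propose_replacement(sentence):
    """Propose x' by replacing one word; return (x', forward_prob, backward_prob).
    A gradient-guided attacker would rank candidates by the victim's gradients;
    here we pick uniformly from a tiny hypothetical candidate set."""
    idx = random.randrange(len(sentence))
    candidates = ["good", "bad", "film", "movie"]  # hypothetical candidate words
    new_word = random.choice(candidates)
    proposed = sentence[:idx] + [new_word] + sentence[idx + 1:]
    forward = 1.0 / (len(sentence) * len(candidates))
    backward = 1.0 / (len(sentence) * len(candidates))  # symmetric in this toy setup
    return proposed, forward, backward

def mh_attack(sentence, victim_label, steps=100):
    x = list(sentence)
    for _ in range(steps):
        x_new, g_fwd, g_bwd = propose_replacement(x)
        # Metropolis-Hastings acceptance ratio alpha = min(1, pi(x')g(x|x') / pi(x)g(x'|x)).
        alpha = min(1.0, (target_density(x_new, victim_label) * g_bwd)
                         / max(target_density(x, victim_label) * g_fwd, 1e-12))
        if random.random() < alpha:
            x = x_new
    return x

print(" ".join(mh_attack("this movie is great".split(), victim_label=1, steps=10)))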

Citation (APA)

Zhang, H., Zhou, H., Miao, N., & Li, L. (2019). Generating fluent adversarial examples for natural languages. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp. 5564–5569). Association for Computational Linguistics. https://doi.org/10.18653/v1/P19-1559
