Adversarial attacks on deep neural networks continue to raise security concerns in natural language processing research. Existing defenses focus on improving the robustness of the victim model during training, but they often neglect to proactively mitigate adversarial attacks during inference. To address this overlooked aspect, we propose a defense framework that mitigates attacks by confusing attackers and correcting the adversarial contexts caused by malicious perturbations. The framework comprises three components, which can be flexibly combined: (1) a synonym-based transformation that randomly corrupts adversarial contexts at the word level, (2) a BERT-based defender that corrects abnormal contexts at the representation level, and (3) a simple detection method that filters out adversarial examples. In addition, the framework helps improve the robustness of the victim model during training. Extensive experiments demonstrate its effectiveness in defending against word-level adversarial attacks.
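To make the first component more concrete, the sketch below illustrates what a word-level randomized synonym transformation can look like: tokens are independently swapped for synonyms with some probability, breaking up the carefully chosen adversarial word combination. This is only a minimal illustration under assumptions, not the authors' exact procedure; the synonym table, the function name randomly_corrupt, and the swap_prob parameter are all hypothetical.

```python
import random

# Hypothetical, hand-written synonym table used only for illustration.
# The actual framework would draw candidates from a much larger synonym
# source (e.g., an embedding- or thesaurus-based candidate set).
SYNONYMS = {
    "good": ["great", "fine", "decent"],
    "bad": ["poor", "awful", "terrible"],
    "movie": ["film", "picture"],
}

def randomly_corrupt(tokens, swap_prob=0.3, rng=random):
    """Randomly replace tokens with synonyms to disrupt adversarial contexts.

    Each token is swapped for a random synonym with probability `swap_prob`;
    tokens without known synonyms are left unchanged.
    """
    corrupted = []
    for tok in tokens:
        candidates = SYNONYMS.get(tok.lower())
        if candidates and rng.random() < swap_prob:
            corrupted.append(rng.choice(candidates))
        else:
            corrupted.append(tok)
    return corrupted

# Example: an adversarially perturbed sentence gets randomly re-worded,
# which tends to invalidate the attacker's specific perturbation.
print(randomly_corrupt("the movie was not good".split()))
```

In the full framework, such a randomized corruption step would be followed by a representation-level correction (the BERT-based defender) before the input reaches the victim model.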
CITATION STYLE
Wang, Z., Liu, Z., Zheng, X., Su, Q., & Wang, J. (2023). RMLM: A Flexible Defense Framework for Proactively Mitigating Word-level Adversarial Attacks. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 1, pp. 2757–2774). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.acl-long.155