Abstract
Recent work has shown that current text classification models are vulnerable to small adversarial perturbations of their inputs, and adversarial training, which re-trains the models on adversarial examples, is the most popular way to mitigate the impact of such perturbations. However, current adversarial training methods suffer from two principal problems: a drop in the model's generalization and ineffective defense against other text attacks. In this paper, we propose a Keyword-bias-aware Adversarial Text Generation model (KATG) that implicitly generates adversarial sentences using a generator-discriminator structure. Instead of using a single benign sentence to generate an adversarial sentence, the KATG model utilizes multiple additional benign sentences (namely prior sentences) to guide adversarial sentence generation. Furthermore, to cover more of the perturbations used in existing attacks, a keyword-bias-based sampling is proposed to select sentences containing biased words as prior sentences. In addition, to effectively utilize prior sentences, a generative flow mechanism is proposed to construct a latent semantic space for learning a latent representation of the prior sentences. Experiments demonstrate that adversarial sentences generated by our KATG model can strengthen both the generalization and the robustness of text classification models.
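To illustrate the keyword-bias-based sampling idea, the sketch below selects prior sentences that contain class-biased words. The bias metric (the largest class-conditional share of a word's occurrences), the function names, and the threshold are all illustrative assumptions; the paper's exact definitions may differ.

```python
from collections import Counter

def keyword_bias(sentences, labels):
    """Estimate per-word class bias as the maximum class-conditional
    share of the word's document frequency (hypothetical metric;
    not the paper's exact formulation)."""
    total = Counter()
    per_class = {}
    for sent, label in zip(sentences, labels):
        cls = per_class.setdefault(label, Counter())
        for word in set(sent.lower().split()):
            total[word] += 1
            cls[word] += 1
    return {w: max(c[w] for c in per_class.values()) / total[w]
            for w in total}

def sample_prior_sentences(sentences, labels, k=3, threshold=0.9):
    """Select up to k sentences containing at least one highly
    biased word, to serve as prior sentences for generation."""
    bias = keyword_bias(sentences, labels)
    biased = [s for s in sentences
              if any(bias.get(w, 0.0) >= threshold
                     for w in s.lower().split())]
    return biased[:k]
```

For example, in a small sentiment corpus a word like "great" that appears only in positive sentences gets bias 1.0, so sentences containing it are preferred as prior sentences.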
Citation
Shen, L., Li, S., & Chen, Y. (2022). KATG: Keyword-Bias-Aware Adversarial Text Generation for Text Classification. In Proceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022 (Vol. 36, pp. 11294–11302). Association for the Advancement of Artificial Intelligence. https://doi.org/10.1609/aaai.v36i10.21380