Adversarial Self-Attention for Language Understanding


Abstract

Deep neural models (e.g. Transformer) naturally learn spurious features, which create a “shortcut” between the labels and inputs, thus impairing generalization and robustness. This paper advances the self-attention mechanism to a robust variant for Transformer-based pre-trained language models (e.g. BERT). We propose the Adversarial Self-Attention mechanism (ASA), which adversarially biases the attentions to effectively suppress the model's reliance on specific features (e.g. particular keywords) and encourage its exploration of broader semantics. We conduct a comprehensive evaluation across a wide range of tasks for both the pre-training and fine-tuning stages. For pre-training, ASA delivers remarkable performance gains over naive training, even when the latter runs for more steps. For fine-tuning, ASA-empowered models outperform naive models by a large margin in both generalization and robustness.
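The paper's exact ASA formulation is not reproduced on this page. Purely as an illustrative sketch (not the authors' method), the core idea of adversarially biasing attention away from shortcut features can be mimicked by suppressing, for each query, the single most-attended key (a stand-in for a "shortcut" keyword) and renormalizing, forcing probability mass onto the broader context. The function name and the argmax-based worst-case choice below are assumptions for illustration only:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def adversarially_mask_attention(scores):
    """Illustrative shortcut suppression (NOT the paper's ASA):
    zero out the most-attended key per query, then renormalize so
    attention must spread over the remaining context tokens.

    scores: (num_queries, num_keys) raw attention logits.
    """
    attn = softmax(scores)
    mask = np.ones_like(attn)
    # The most-attended key acts as a proxy for a "shortcut" feature.
    shortcut = attn.argmax(axis=-1)
    mask[np.arange(attn.shape[0]), shortcut] = 0.0
    masked = attn * mask
    return masked / masked.sum(axis=-1, keepdims=True)

# One query attending heavily to the first of three keys:
scores = np.array([[4.0, 1.0, 0.5]])
biased = adversarially_mask_attention(scores)
# The dominant key now receives zero attention; the rest sum to 1.
```

In the actual paper the adversarial bias is learned (maximizing the model's loss under a perturbation budget) rather than chosen by a hard argmax, but the sketch conveys the qualitative effect: the model cannot lean on its single strongest attention target.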

Citation (APA)

Wu, H., Ding, R., Zhao, H., Xie, P., Huang, F., & Zhang, M. (2023). Adversarial Self-Attention for Language Understanding. In Proceedings of the 37th AAAI Conference on Artificial Intelligence, AAAI 2023 (Vol. 37, pp. 13727–13735). AAAI Press. https://doi.org/10.1609/aaai.v37i11.26608
