Improving Robustness of Language Models from a Geometry-aware Perspective

Abstract

Recent studies have found that removing the norm-bounded projection and increasing the number of search steps in adversarial training can significantly improve robustness. However, we observe that too many search steps can hurt accuracy. We aim to obtain strong robustness efficiently, using fewer steps. Through a toy experiment, we find that perturbing the clean data up to the decision boundary, without crossing it, does not degrade test accuracy. Inspired by this, we propose friendly adversarial data augmentation (FADA) to generate friendly adversarial data. On top of FADA, we propose geometry-aware adversarial training (GAT), which performs adversarial training on friendly adversarial data and thereby saves a large number of search steps. Comprehensive experiments across two widely used datasets and three pretrained language models demonstrate that GAT can obtain stronger robustness with fewer steps. In addition, we provide extensive empirical results and in-depth analyses of robustness to facilitate future studies.
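The idea of perturbing data up to the decision boundary without crossing it can be illustrated with a small sketch. The code below is not the FADA or GAT procedure from the paper. It is a toy illustration that assumes a two-class linear classifier over numeric vectors, and the names `linear_score` and `friendly_perturb`, along with the step size and step budget, are hypothetical choices made only for this example.

```python
from typing import List, Sequence, Tuple


def linear_score(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Signed score of a toy linear classifier; the sign is the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def friendly_perturb(
    x: Sequence[float],
    label: int,
    w: Sequence[float],
    b: float,
    step_size: float = 0.1,
    max_steps: int = 50,
) -> Tuple[List[float], int]:
    """Move x toward the decision boundary, stopping before it is crossed.

    label must be +1 or -1. Returns the perturbed point and the number of
    steps actually taken. This is an assumed toy setting, not the paper's
    word-substitution or embedding-space search.
    """
    norm = sum(wi * wi for wi in w) ** 0.5
    # Unit direction that shrinks the margin label * score fastest.
    direction = [-label * wi / norm for wi in w]

    current = list(x)
    steps = 0
    for _ in range(max_steps):
        candidate = [c + step_size * d for c, d in zip(current, direction)]
        # Reject the step if it would reach or cross the boundary.
        if label * linear_score(w, b, candidate) <= 0:
            break
        current = candidate
        steps += 1
    return current, steps


# Usage: a point of class +1 that starts 0.35 away from the boundary x0 = 0.
w, b = [1.0, 0.0], 0.0
x_friendly, n_steps = friendly_perturb([0.35, 2.0], +1, w, b)
print(n_steps, linear_score(w, b, x_friendly) > 0)
```

In this sketch the returned point is still classified correctly but lies within one step of the boundary, which is the sense in which the perturbation stays "friendly".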

Zhu, B., Gu, Z., Wang, L., Chen, J., & Xuan, Q. (2022). Improving Robustness of Language Models from a Geometry-aware Perspective. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 3115–3125). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.findings-acl.246
