Robust Lottery Tickets for Pre-trained Language Models


Abstract

Recent work on the Lottery Ticket Hypothesis has shown that pre-trained language models (PLMs) contain smaller matching subnetworks (winning tickets) capable of reaching accuracy comparable to that of the original models. However, these tickets have proved not to be robust to adversarial examples, performing even worse than their PLM counterparts. To address this problem, we propose a novel method that learns binary weight masks to identify robust tickets hidden in the original PLMs. Since the loss is not differentiable with respect to the binary masks, we place a hard concrete distribution over the masks and encourage their sparsity via a smooth approximation of L0 regularization. Furthermore, we design an adversarial loss objective to guide the search for robust tickets and ensure that the tickets perform well in both accuracy and robustness. Experimental results show that the proposed method significantly improves over previous work on adversarial robustness evaluation.
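
For readers unfamiliar with the masking trick the abstract refers to, the following is a minimal PyTorch sketch of a hard concrete gate with a smooth L0 penalty, in the spirit of Louizos et al. (2018), which the paper builds on. The class name, hyperparameter values (beta, gamma, zeta), and usage are illustrative assumptions for exposition, not the authors' released implementation.

    # Illustrative sketch of a hard concrete gate with a smooth L0 surrogate.
    # Not the paper's code; hyperparameters follow common defaults from
    # Louizos et al. (2018).
    import math
    import torch
    import torch.nn as nn

    class HardConcreteMask(nn.Module):
        """Learnable, approximately binary mask over a weight tensor."""

        def __init__(self, shape, beta=2.0 / 3.0, gamma=-0.1, zeta=1.1):
            super().__init__()
            self.log_alpha = nn.Parameter(torch.zeros(shape))  # location parameter
            self.beta, self.gamma, self.zeta = beta, gamma, zeta

        def forward(self):
            if self.training:
                # Reparameterized sample: uniform noise through a tempered sigmoid,
                # so gradients flow to log_alpha despite the (near-)binary output.
                u = torch.rand_like(self.log_alpha).clamp(1e-6, 1 - 1e-6)
                s = torch.sigmoid((u.log() - (1 - u).log() + self.log_alpha) / self.beta)
            else:
                s = torch.sigmoid(self.log_alpha)
            # Stretch to (gamma, zeta), then clip to [0, 1] so the distribution
            # places non-zero mass on exactly 0 and exactly 1.
            return (s * (self.zeta - self.gamma) + self.gamma).clamp(0.0, 1.0)

        def l0_penalty(self):
            # Expected number of non-zero gates: the differentiable L0 surrogate.
            return torch.sigmoid(
                self.log_alpha - self.beta * math.log(-self.gamma / self.zeta)
            ).sum()

During the ticket search, each PLM weight matrix would be multiplied elementwise by such a gate, and the task loss (here augmented with an adversarial term) would be summed with l0_penalty() scaled by a sparsity coefficient; thresholding the converged gates yields the binary mask that defines the robust ticket.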

Citation (APA)

Zheng, R., Bao, R., Zhou, Y., Liang, D., Wang, S., Wu, W., … Huang, X. (2022). Robust Lottery Tickets for Pre-trained Language Models. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 1, pp. 2211–2224). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.acl-long.157
