Doge Tickets: Uncovering Domain-General Language Models by Playing Lottery Tickets

Abstract

Over-parameterized pre-trained language models (LMs) have shown appealing expressive power due to their small learning bias. However, the huge learning capacity of LMs can also lead to large learning variance. In a pilot study, we find that, when faced with multiple domains, a critical portion of parameters behave unexpectedly in a domain-specific manner while others behave in a domain-general one. Motivated by this phenomenon, we posit for the first time that domain-general parameters can underpin a domain-general LM that can be derived from the original LM. To uncover the domain-general LM, we propose to identify domain-general parameters by playing lottery tickets (dubbed doge tickets). To intervene in the lottery, we propose a domain-general score, which depicts how domain-invariant a parameter is by associating it with the variance. Comprehensive experiments are conducted on the Amazon, MNLI, and OntoNotes datasets. The results show that doge tickets obtain improved out-of-domain generalization in comparison with a range of competitive baselines. Analysis results further hint at the existence of domain-general parameters and the performance consistency of doge tickets.
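
The abstract does not give the formula for the domain-general score, so the following is only an illustrative sketch of the idea, not the authors' method. It assumes each parameter already has one importance value per domain, and it assumes a score of the form "mean importance minus a variance penalty"; the function names, the `lam` weight, and the toy numbers are all hypothetical.

```python
from statistics import mean, pvariance


def domain_general_scores(importance_by_domain, lam=1.0):
    """Score each parameter by mean importance minus a variance penalty.

    importance_by_domain: one list per domain, each holding one importance
    value per parameter (assumed input; the abstract does not say how
    importance is measured).
    lam: hypothetical weight on the cross-domain variance.
    """
    n_params = len(importance_by_domain[0])
    scores = []
    for j in range(n_params):
        per_domain = [domain[j] for domain in importance_by_domain]
        # High mean and low variance across domains => more domain-general.
        scores.append(mean(per_domain) - lam * pvariance(per_domain))
    return scores


def select_ticket(scores, keep_ratio):
    """Return a 0/1 mask keeping the highest-scoring fraction of parameters."""
    k = max(1, int(round(keep_ratio * len(scores))))
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = set(ranked[:k])
    return [1 if j in keep else 0 for j in range(len(scores))]


# Toy importance values for four parameters over three domains (made up).
importance = [
    [0.9, 0.8, 0.1, 0.5],  # domain A
    [0.9, 0.1, 0.1, 0.5],  # domain B
    [0.9, 0.9, 0.1, 0.5],  # domain C
]
scores = domain_general_scores(importance)
mask = select_ticket(scores, keep_ratio=0.5)
```

In this toy setup the second parameter has a higher mean importance than the fourth, but its importance swings widely between domains, so the variance penalty pushes it out of the kept half.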

Cite

Yang, Y., Zhang, C., Wang, B., & Song, D. (2022). Doge Tickets: Uncovering Domain-General Language Models by Playing Lottery Tickets. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13551 LNAI, pp. 144–156). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-17120-8_12
