As Good as New. How to Successfully Recycle English GPT-2 to Make Models for Other Languages

Citations: 55
Readers: 109 (Mendeley)

Abstract

Large generative language models have been very successful for English, but other languages lag behind, in part due to data and computational limitations. We propose a method that may overcome these problems by adapting existing pre-trained models to new languages. Specifically, we describe the adaptation of English GPT-2 to Italian and Dutch by retraining lexical embeddings without tuning the Transformer layers. As a result, we obtain lexical embeddings for Italian and Dutch that are aligned with the original English lexical embeddings. Additionally, we scale up complexity by transforming the relearned lexical embeddings of GPT-2 small to the GPT-2 medium embedding space. This method minimises the amount of training needed and prevents the information learned by GPT-2 from being lost during adaptation. English GPT-2 models with relearned lexical embeddings can generate realistic sentences in Italian and Dutch. Though on average these sentences are still identifiable as artificial by humans, they are rated on par with sentences generated by a GPT-2 model fully trained from scratch.
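
To make the adaptation step concrete, below is a minimal PyTorch sketch of the embedding-only retraining described in the abstract, assuming the HuggingFace transformers library. The learning rate and the random `batch` are illustrative placeholders rather than the authors' exact setup; in `GPT2LMHeadModel`, the input embeddings (`transformer.wte`) are weight-tied to the output projection (`lm_head`), so unfreezing the former retrains both.

```python
import torch
from transformers import GPT2LMHeadModel

# Load the pre-trained English GPT-2 (small) and put it in train mode.
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.train()

# Freeze all parameters, including the Transformer layers...
for param in model.parameters():
    param.requires_grad = False

# ...then unfreeze only the lexical (token) embeddings. transformer.wte
# is weight-tied to lm_head, so retraining it updates both the input
# embeddings and the output projection. The English embeddings serve as
# the starting point here; the paper's exact initialization may differ.
model.transformer.wte.weight.requires_grad = True

# Optimize only the trainable embedding weights.
optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=1e-4
)

# One illustrative training step; `batch` stands in for token ids of
# Italian or Dutch text from a new tokenizer with the same vocabulary
# size as English GPT-2, so the embedding matrix keeps its shape.
batch = torch.randint(0, model.config.vocab_size, (2, 32))
loss = model(input_ids=batch, labels=batch).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

Because the Transformer layers stay frozen, only the (vocab × 768) embedding matrix is updated, which keeps the training cost low and leaves everything GPT-2 learned about sequence structure intact.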
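The abstract does not spell out how the relearned small-model embeddings are transformed into the GPT-2 medium embedding space; one plausible realization, sketched below under that assumption, is a least-squares linear map fitted on the aligned English embeddings of the two model sizes. `W_new_small` is a hypothetical placeholder for the relearned target-language embeddings from the step above.

```python
import torch
from transformers import GPT2LMHeadModel

# English lexical embeddings of both model sizes: (vocab, 768) for
# small and (vocab, 1024) for medium, aligned row by row because the
# two English models share the same vocabulary.
W_small = GPT2LMHeadModel.from_pretrained("gpt2").transformer.wte.weight.detach()
W_medium = GPT2LMHeadModel.from_pretrained("gpt2-medium").transformer.wte.weight.detach()

# Fit a linear map M of shape (768, 1024) minimizing
# ||W_small @ M - W_medium|| over the shared English vocabulary.
M = torch.linalg.lstsq(W_small, W_medium).solution

# Project the relearned small-model embeddings for the new language
# into the medium embedding space; the result can initialize the
# embeddings of GPT-2 medium before (frozen-layer) retraining.
W_new_small = torch.randn_like(W_small)  # placeholder for relearned embeddings
W_new_medium_init = W_new_small @ M
```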

Citation (APA)

de Vries, W., & Nissim, M. (2021). As Good as New. How to Successfully Recycle English GPT-2 to Make Models for Other Languages. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (pp. 836–846). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.findings-acl.74
