On-the-fly Cross-lingual Masking for Multilingual Pre-training

Citations: 3
Readers (Mendeley): 17

Abstract

In multilingual pre-training with an MLM (masked language modeling) objective on multiple monolingual corpora, multilingual models learn cross-linguality only implicitly, from the isomorphic spaces formed by overlapping different language spaces, because there is no explicit cross-lingual forward pass. In this work, we present CLPM (Cross-lingual Prototype Masking), a dynamic, token-wise masking scheme for multilingual pre-training that uses a special token [C]_x to replace a random token x in the input sentence. [C]_x is a cross-lingual prototype for x and thus forms an explicit cross-lingual forward pass. We instantiate CLPM for the multilingual pre-training phase of UNMT (unsupervised neural machine translation), and experiments show that CLPM consistently improves the performance of UNMT models on {De, Ro, Ne} ↔ En. Beyond UNMT and bilingual tasks, we show that CLPM also consistently improves the performance of multilingual models on cross-lingual classification.
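The abstract describes CLPM only at a high level, so the snippet below is a minimal sketch of the token-wise masking idea, not the authors' implementation. The function name clpm_mask and the prototype_fn argument are hypothetical; prototype_fn stands in for however the paper actually computes the cross-lingual prototype [C]_x for a token x (e.g. from cross-lingual embeddings), and here it defaults to a literal placeholder string.

import random

def clpm_mask(tokens, mask_prob=0.15, prototype_fn=None, seed=None):
    """Replace a random subset of tokens with cross-lingual prototype
    placeholders, CLPM-style, instead of a generic [MASK] token.

    prototype_fn is a stand-in for the paper's actual derivation of
    [C]_x; by default a literal "[C]_<token>" string is used.
    """
    rng = random.Random(seed)
    masked, targets = [], []
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            proto = prototype_fn(tok) if prototype_fn else f"[C]_{tok}"
            masked.append(proto)        # prototype replaces the token on the fly
            targets.append((i, tok))    # the model must still predict the original x
        else:
            masked.append(tok)
    return masked, targets

# Toy usage: mask an English sentence token by token
sentence = "the cat sat on the mat".split()
masked, targets = clpm_mask(sentence, mask_prob=0.3, seed=0)
print(masked)   # some tokens replaced by [C]_<token> placeholders
print(targets)  # (position, original token) pairs used as MLM targets

Because the replacement is decided per token at training time rather than fixed in a preprocessed corpus, the masking is "on-the-fly", which is the property the title refers to.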

Cite

APA

Ai, X., & Fang, B. (2023). On-the-fly Cross-lingual Masking for Multilingual Pre-training. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 1, pp. 855–876). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.acl-long.49
