Multi-Granularity Contrasting for Cross-Lingual Pre-Training


Abstract

Cross-lingual pre-training aims at providing effective prior representations for inputs from multiple languages. By modeling bidirectional contexts, recently prevalent language modeling approaches such as XLM achieve better performance than traditional methods based on embedding alignment, which strive to assign similar vector representations to semantically equivalent units. However, approaches like XLM capture cross-lingual information based solely on a shared BPE vocabulary, resulting in the absence of the fine-grained supervision induced by embedding alignment. Inheriting the advantages of the above two paradigms, this work presents a multi-granularity contrasting framework, namely MGC, to learn language-universal representations. While predicting masked words based on bidirectional contexts, the proposed model also encodes semantic equivalents from different languages into similar representations, introducing more fine-grained and explicit cross-lingual information. Two effective contrasting strategies are further proposed, which can be built upon semantic units of multiple granularities, covering words, spans, and sentences. Extensive experiments demonstrate that our approach achieves significant performance gains on various downstream tasks, including machine translation and cross-lingual language understanding.
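To make the contrasting idea concrete, the sketch below shows a standard InfoNCE-style contrastive loss over a batch of aligned cross-lingual embedding pairs: each source-language unit is pulled toward its aligned target-language equivalent and pushed away from the other units in the batch. This is a minimal illustration of the general technique, not the paper's exact objective; the function name, temperature value, and NumPy formulation are assumptions for demonstration.

```python
import numpy as np

def info_nce(src, tgt, temperature=0.1):
    """Contrastive loss over aligned embedding pairs.

    src, tgt: (batch, dim) arrays where row i of src and row i of tgt
    are semantically equivalent units (word, span, or sentence vectors)
    from two languages. Other rows in the batch act as negatives.
    """
    # Normalize so dot products become cosine similarities.
    src = src / np.linalg.norm(src, axis=1, keepdims=True)
    tgt = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)
    # Pairwise similarity: row i of src against every row of tgt.
    logits = src @ tgt.T / temperature
    # Softmax over each row; the aligned pair (i, i) is the positive.
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    idx = np.arange(len(src))
    return -np.log(probs[idx, idx]).mean()
```

The same loss applies unchanged at each granularity; only the inputs differ (word embeddings, pooled span representations, or sentence vectors), which is what allows a single framework to cover all three levels.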

Citation (APA)

Li, S., Yang, P., Luo, F., & Xie, J. (2021). Multi-Granularity Contrasting for Cross-Lingual Pre-Training. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (pp. 1708–1717). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.findings-acl.149
