Parallel sentences mining with transfer learning in an unsupervised setting

Abstract

Both the quality and the quantity of parallel sentences are known to be very important when training neural machine translation (NMT) systems. However, such resources are not available for many low-resource language pairs. Many existing mining methods need strong supervision and hence are not suitable for these pairs. Although there have been several attempts at developing unsupervised models, they ignore the language-invariant information shared between languages. In this paper, we propose an approach based on transfer learning to mine parallel sentences in an unsupervised setting. With the help of bilingual corpora from rich-resource language pairs, we can mine parallel sentences without bilingual supervision for low-resource language pairs. Experiments show that our approach improves parallel sentence mining performance compared with previous methods. In particular, we achieve good results on two real-world low-resource language pairs.

Cite

Sun, Y., Zhu, S., Mi, C., & Feng, Y. (2021). Parallel sentences mining with transfer learning in an unsupervised setting. In NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Student Research Workshop (pp. 136–142). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.naacl-srw.17
