Learning a bilingual lexicon from monolingual data is a novel idea in natural language processing that can benefit many low-resource language pairs. In this paper, we present an approach for obtaining a bilingual lexicon from monolingual data. Our method requires only a small seed bilingual lexicon, and we use Canonical Correlation Analysis to construct a shared latent space that explains how the two monolingual embeddings are linked. Experimental results show that a bilingual lexicon of considerable precision and size can be learned from Chinese-Uyghur and Chinese-Kazakh monolingual data.
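The abstract gives no implementation details, so the following is only a minimal sketch of the general idea: fit Canonical Correlation Analysis on the embedding pairs of a small seed lexicon, project both monolingual embedding sets into the shared space, and pair each source word with its nearest target word. Everything here is an assumption for illustration: the embeddings are synthetic (the target space is a noisy rotation of the source space), the function names `fit_cca` and `induce_lexicon` are hypothetical, and cosine nearest-neighbour retrieval is a common choice that may differ from what the paper uses.

```python
import numpy as np

def inv_sqrt(cov):
    """Inverse square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(cov)
    return (vecs / np.sqrt(vals)) @ vecs.T

def fit_cca(X, Y, k, reg=1e-8):
    """Fit CCA on row-aligned seed pairs X (source) and Y (target).

    Returns the two means and two projection matrices that map each
    space into a shared k-dimensional latent space.
    """
    mu_x, mu_y = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mu_x, Y - mu_y
    n = X.shape[0]
    # Covariances, with a small ridge term (assumed) for numerical stability.
    Cxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / n
    Kx, Ky = inv_sqrt(Cxx), inv_sqrt(Cyy)
    # SVD of the whitened cross-covariance gives the canonical directions.
    U, _, Vt = np.linalg.svd(Kx @ Cxy @ Ky)
    return mu_x, mu_y, Kx @ U[:, :k], Ky @ Vt.T[:, :k]

def induce_lexicon(src_vecs, tgt_vecs, model):
    """For each source row, return the index of the nearest target row."""
    mu_x, mu_y, Wx, Wy = model
    P = (src_vecs - mu_x) @ Wx
    Q = (tgt_vecs - mu_y) @ Wy
    P /= np.linalg.norm(P, axis=1, keepdims=True)
    Q /= np.linalg.norm(Q, axis=1, keepdims=True)
    return (P @ Q.T).argmax(axis=1)  # cosine nearest neighbour

# Synthetic stand-in for two monolingual embedding tables (assumption):
# row i of `src` and row i of `tgt` are translations of each other.
rng = np.random.default_rng(0)
n_words, dim, n_seed = 300, 10, 100
src = rng.normal(size=(n_words, dim))
rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
tgt = src @ rotation + 0.01 * rng.normal(size=(n_words, dim))

# Fit on the seed lexicon only, then induce translations for the rest.
model = fit_cca(src[:n_seed], tgt[:n_seed], k=dim)
pred = induce_lexicon(src[n_seed:], tgt[n_seed:], model)
precision = float(np.mean(pred == np.arange(n_words - n_seed)))
print(f"held-out precision@1: {precision:.2f}")
```

On real data the two embedding tables would come from separately trained monolingual models, the seed rows would be looked up from the seed lexicon, and the relation between the spaces would be far noisier than this toy rotation, so the precision printed here says nothing about the results reported in the paper.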
CITATION STYLE
Zhu, S. L., Li, X., Yang, Y. T., Wang, L., & Mi, C. G. (2018). Learning Bilingual Lexicon for Low-Resource Language Pairs. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10619 LNAI, pp. 760–770). Springer Verlag. https://doi.org/10.1007/978-3-319-73618-1_66