Improving bilingual lexicon induction with unsupervised post-processing of monolingual word vector spaces

Abstract

Work on projection-based induction of cross-lingual word embedding spaces (CLWEs) predominantly focuses on improving the projection (i.e., mapping) mechanisms. In this work, in contrast, we show that a simple method for post-processing monolingual embedding spaces facilitates learning of the cross-lingual alignment and, in turn, substantially improves bilingual lexicon induction (BLI). The post-processing method we examine is grounded in the generalisation of first- and second-order monolingual similarities to the nth-order similarity. Because it post-processes the monolingual spaces before the cross-lingual alignment, the method can be coupled with any projection-based method for inducing CLWE spaces. We demonstrate the effectiveness of this simple monolingual post-processing across a set of 15 typologically diverse languages (i.e., 15×14 BLI setups), and in combination with two different projection methods.
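
The abstract does not spell out how the nth-order similarity is constructed. As a minimal sketch only, assume the first-order similarity of an embedding matrix X is XXᵀ and the nth-order similarity is its nth matrix power; one way to obtain embeddings with that similarity is a linear map built from the eigendecomposition of XᵀX. The function name `nth_order_postprocess`, the restriction to n ≥ 1, and the toy data are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def nth_order_postprocess(X, n):
    """Map X to X_n so that X_n @ X_n.T equals (X @ X.T) to the n-th matrix power.

    Assumption (not from the abstract): similarity is the plain dot product,
    and n >= 1 so that no negative eigenvalue exponents are needed.
    """
    if n < 1:
        raise ValueError("this sketch only handles n >= 1")
    # X^T X is symmetric positive semi-definite, so eigh applies.
    eigvals, Q = np.linalg.eigh(X.T @ X)
    # Guard against tiny negative eigenvalues caused by rounding.
    eigvals = np.clip(eigvals, 0.0, None)
    alpha = (n - 1) / 2.0
    # Scaling the columns of Q is the same as Q @ diag(eigvals ** alpha).
    W = Q * eigvals ** alpha
    return X @ W


# Toy monolingual space: 6 "words", 4 dimensions (hypothetical data).
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 4))
X2 = nth_order_postprocess(X, 2)

# Second-order similarity: the similarity of the similarity rows.
S = X @ X.T
print(np.allclose(X2 @ X2.T, S @ S))
```

Under these assumptions, each monolingual space would be transformed independently in this way and then handed unchanged to whichever projection-based alignment method is in use, which is what lets the post-processing be combined with any such method.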

Cite

Vulić, I., Korhonen, A., & Glavaš, G. (2020). Improving bilingual lexicon induction with unsupervised post-processing of monolingual word vector spaces. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 45–54). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.repl4nlp-1.7
