Improving bilingual lexicon induction with unsupervised post-processing of monolingual word vector spaces

Ivan Vulić; Anna Korhonen; Goran Glavaš

Conference Proceedings (Open Access)

Proceedings of the Annual Meeting of the Association for Computational Linguistics (2020) 45-54

DOI: 10.18653/v1/2020.repl4nlp-1.7

6 Citations

85 Readers

Abstract

Work on projection-based induction of cross-lingual word embedding spaces (CLWEs) predominantly focuses on improving the projection (i.e., mapping) mechanisms. In this work, in contrast, we show that a simple method for post-processing monolingual embedding spaces facilitates learning of the cross-lingual alignment and, in turn, substantially improves bilingual lexicon induction (BLI). The post-processing method we examine is grounded in the generalisation of first- and second-order monolingual similarities to the nth-order similarity. Because it post-processes the monolingual spaces before the cross-lingual alignment, the method can be coupled with any projection-based method for inducing CLWE spaces. We demonstrate the effectiveness of this simple monolingual post-processing across a set of 15 typologically diverse languages (i.e., 15×14 BLI setups), and in combination with two different projection methods.
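The abstract does not spell out the post-processing formula, so the following is only a minimal sketch of one way an nth-order similarity generalisation can be realised, not necessarily the authors' exact procedure. It assumes that first-order similarity is the dot-product matrix X Xᵀ, that second-order similarity is (X Xᵀ)², and that intermediate or higher orders are reached by rescaling the eigen-directions of Xᵀ X with an exponent `alpha`. The function name `postprocess`, the parameter `alpha`, and the random toy matrix are hypothetical and chosen purely for illustration.

```python
import numpy as np


def postprocess(X, alpha):
    """Linearly transform a monolingual embedding matrix X (n_words x dim).

    Assumption (not stated in the abstract): the transform is X @ Q @ diag(lam**alpha),
    where X.T @ X = Q @ diag(lam) @ Q.T. Under this assumption alpha = 0 keeps
    first-order similarity and alpha = 0.5 yields second-order similarity.
    """
    # Eigendecomposition of the symmetric dim x dim matrix X^T X.
    lam, Q = np.linalg.eigh(X.T @ X)
    # Guard against tiny or slightly negative eigenvalues from round-off.
    lam = np.clip(lam, 1e-12, None)
    # Scaling the columns of Q by lam**alpha is the same as Q @ diag(lam**alpha).
    return X @ (Q * lam ** alpha)


# Toy data standing in for a monolingual embedding space.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 8))

first_order = X @ X.T                  # dot-product similarities
X_second = postprocess(X, alpha=0.5)   # similarities become (X X^T)^2
X_same = postprocess(X, alpha=0.0)     # similarities are unchanged
```

In such a setup, each monolingual space would be transformed independently, and the transformed matrices would then be passed to whichever projection-based alignment method is in use; the value of `alpha` would be a tunable hyperparameter.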

Citation (APA)

Vulić, I., Korhonen, A., & Glavaš, G. (2020). Improving bilingual lexicon induction with unsupervised post-processing of monolingual word vector spaces. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 45–54). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.repl4nlp-1.7
