Are girls neko or shojo? Cross-lingual alignment of non-isomorphic embeddings with iterative normalization

Abstract

Cross-lingual word embeddings (CLWE) underlie many multilingual natural language processing systems, often through orthogonal transformations of pre-trained monolingual embeddings. However, orthogonal mapping only works on language pairs whose embeddings are naturally isomorphic. For non-isomorphic pairs, our method (Iterative Normalization) transforms monolingual embeddings to make orthogonal alignment easier by simultaneously enforcing that (1) individual word vectors are unit length, and (2) each language's average vector is zero. Iterative Normalization consistently improves word translation accuracy of three CLWE methods, with the largest improvement observed on English-Japanese (from 2% to 44% test accuracy).
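The two constraints named in the abstract can be illustrated with a short NumPy sketch. It is not taken from the authors' code: the function name `iterative_normalization`, the fixed iteration count `n_iter`, and the guard `eps` are assumptions made for illustration, and the sketch simply alternates the two steps the abstract describes (scale each word vector to unit length, then subtract the language's average vector).

```python
import numpy as np


def iterative_normalization(X, n_iter=5, eps=1e-12):
    """Alternate unit-length scaling and mean centering on the rows of X.

    X is an (n_words, dim) matrix of monolingual embeddings.
    n_iter and eps are illustrative choices, not values from the paper.
    """
    X = np.asarray(X, dtype=np.float64).copy()
    for _ in range(n_iter):
        # (1) Scale every word vector to unit length (eps guards zero rows).
        norms = np.linalg.norm(X, axis=1, keepdims=True)
        X /= np.maximum(norms, eps)
        # (2) Subtract the average vector so the embeddings have zero mean.
        X -= X.mean(axis=0, keepdims=True)
    return X


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy "embeddings" with a nonzero mean, standing in for one language.
    toy = rng.normal(size=(200, 10)) + 0.5
    out = iterative_normalization(toy, n_iter=50)
    print("largest |mean| component:", np.abs(out.mean(axis=0)).max())
    print("largest deviation from unit norm:",
          np.abs(np.linalg.norm(out, axis=1) - 1.0).max())
```

In this sketch each language's embedding matrix would be normalized separately, before any orthogonal alignment between the two languages is learned. Each step slightly disturbs the property enforced by the other, which is why the two steps are repeated rather than applied once.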

Cite

Zhang, M., Xu, K., Kawarabayashi, K. I., Jegelka, S., & Boyd-Graber, J. (2019). Are girls neko or shojo? Cross-lingual alignment of non-isomorphic embeddings with iterative normalization. In ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (pp. 3180–3189). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/p19-1307
