Learning cross-lingual word embeddings via matrix co-factorization

48 Citations
147 Readers (Mendeley users who have this article in their library)

Abstract

A joint-space model for cross-lingual distributed representations generalizes language-invariant semantic features. In this paper, we present a matrix co-factorization framework for learning cross-lingual word embeddings. We explicitly define monolingual training objectives in the form of matrix decomposition and induce cross-lingual constraints for simultaneously factorizing the monolingual matrices. These cross-lingual constraints can be derived from parallel corpora, with or without word alignments. Empirical results on a cross-lingual document classification task show that our method effectively encodes cross-lingual knowledge as constraints for learning cross-lingual word embeddings.
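To make the framework concrete, below is a minimal sketch (not the authors' implementation) of matrix co-factorization with a cross-lingual constraint: two hypothetical monolingual co-occurrence matrices M_src and M_tgt are each approximated as a product of word and context embeddings, while an alignment-weight matrix A, assumed to come from a word-aligned parallel corpus, pulls the embeddings of aligned word pairs together via a squared-distance penalty. All names, shapes, and the exact form of the penalty are illustrative assumptions, not the paper's objective.

```python
# Minimal sketch of matrix co-factorization with a cross-lingual constraint.
# NOT the authors' code; inputs and the penalty term are illustrative assumptions.
#   M_src, M_tgt : monolingual word-context co-occurrence (e.g., PMI) matrices
#   A            : alignment weights, A[i, j] = strength linking source word i
#                  to target word j (derived from a parallel corpus)
import numpy as np

def co_factorize(M_src, M_tgt, A, dim=40, lam=1.0, lr=1e-3, epochs=200):
    rng = np.random.default_rng(0)
    W_s = rng.normal(scale=0.1, size=(M_src.shape[0], dim))  # source word vectors
    C_s = rng.normal(scale=0.1, size=(M_src.shape[1], dim))  # source context vectors
    W_t = rng.normal(scale=0.1, size=(M_tgt.shape[0], dim))  # target word vectors
    C_t = rng.normal(scale=0.1, size=(M_tgt.shape[1], dim))  # target context vectors
    deg_s = A.sum(axis=1, keepdims=True)      # total alignment mass per source word
    deg_t = A.sum(axis=0)[:, np.newaxis]      # total alignment mass per target word
    for _ in range(epochs):
        # Monolingual reconstruction errors: each M is approximated by W @ C.T.
        E_s = W_s @ C_s.T - M_src
        E_t = W_t @ C_t.T - M_tgt
        # Gradients of ||E_s||^2 + ||E_t||^2 + lam * sum_ij A[i,j] * ||W_s[i] - W_t[j]||^2.
        g_Ws = 2 * E_s @ C_s + 2 * lam * (deg_s * W_s - A @ W_t)
        g_Wt = 2 * E_t @ C_t + 2 * lam * (deg_t * W_t - A.T @ W_s)
        g_Cs = 2 * E_s.T @ W_s
        g_Ct = 2 * E_t.T @ W_t
        W_s -= lr * g_Ws; C_s -= lr * g_Cs
        W_t -= lr * g_Wt; C_t -= lr * g_Ct
    return W_s, W_t  # word embeddings for both languages in a shared space
```

When word alignments are available, A can be filled with alignment counts; without them, a sentence-level constraint (e.g., tying averaged vectors of translation-pair sentences) can play the same role, which is the alternative the abstract alludes to.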

Citation (APA)

Shi, T., Liu, Z., Liu, Y., & Sun, M. (2015). Learning cross-lingual word embeddings via matrix co-factorization. In ACL-IJCNLP 2015 - 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, Proceedings of the Conference (Vol. 2, pp. 567–572). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/p15-2093
