Model-based word embeddings from decompositions of count matrices

45 citations · 145 Mendeley readers

Abstract

This work develops a new statistical understanding of word embeddings induced from transformed count data. Using the class of hidden Markov models (HMMs) underlying Brown clustering as a generative model, we demonstrate how canonical correlation analysis (CCA) and certain count transformations permit efficient and effective recovery of model parameters with lexical semantics. We further show in experiments that these techniques empirically outperform existing spectral methods on word similarity and analogy tasks, and are also competitive with other popular methods such as WORD2VEC and GLOVE.
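The procedure the abstract alludes to can be approximated as a square-root transformation of a word-context co-occurrence count matrix, a CCA-style rescaling by marginal counts, and a truncated SVD. The sketch below is only illustrative: the toy corpus, window size, embedding dimension, and the exact placement of the square-root transform are assumptions, not the authors' experimental configuration.

```python
# Minimal sketch of CCA-style embeddings from transformed counts.
# Assumptions (not from the paper): tiny toy corpus, symmetric window of 2,
# 2-dimensional embeddings, sqrt applied to counts before rescaling.
import numpy as np

corpus = "the cat sat on the mat the dog sat on the rug".split()
window, dim = 2, 2

vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}

# Word-context co-occurrence counts within a symmetric window.
counts = np.zeros((len(vocab), len(vocab)))
for i, w in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if j != i:
            counts[idx[w], idx[corpus[j]]] += 1

# Square-root count transformation, then CCA-style scaling by the
# (transformed) row and column marginals.
sqrt_counts = np.sqrt(counts)
row = sqrt_counts.sum(axis=1, keepdims=True)
col = sqrt_counts.sum(axis=0, keepdims=True)
omega = sqrt_counts / np.sqrt(row) / np.sqrt(col)

# Rank-dim truncated SVD; rows of U serve as word embeddings.
U, S, Vt = np.linalg.svd(omega, full_matrices=False)
embeddings = U[:, :dim]

for w in vocab:
    print(w, np.round(embeddings[idx[w]], 3))
```

On a real corpus the count matrix would be built sparsely and decomposed with a randomized or sparse SVD; the dense NumPy version here is only meant to make the transformation and scaling steps explicit.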

Cite

Citation style: APA

Stratos, K., Collins, M., & Hsu, D. (2015). Model-based word embeddings from decompositions of count matrices. In ACL-IJCNLP 2015 - 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, Proceedings of the Conference (Vol. 1, pp. 1282–1291). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/p15-1124
