Model-based word embeddings from decompositions of count matrices

Karl Stratos; Michael Collins; Daniel Hsu

Conference Proceedings (open access)

ACL-IJCNLP 2015 - 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, Proceedings of the Conference (2015) 1 1282-1291

DOI: 10.3115/v1/p15-1124

45 citations

145 readers

Abstract

This work develops a new statistical understanding of word embeddings induced from transformed count data. Using the class of hidden Markov models (HMMs) underlying Brown clustering as a generative model, we demonstrate how canonical correlation analysis (CCA) and certain count transformations permit efficient and effective recovery of model parameters with lexical semantics. We further show in experiments that these techniques empirically outperform existing spectral methods on word similarity and analogy tasks, and are also competitive with other popular methods such as WORD2VEC and GLOVE.
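
The abstract describes combining a count transformation with CCA to obtain word embeddings. As a rough illustration only, the following Python sketch applies a square-root transform to a word-by-context co-occurrence matrix, rescales it by its row and column totals, and keeps the top left singular vectors. The function name `cca_embeddings`, the choice of square root as the transform, the particular scaling, and the toy counts are all assumptions made for this example; the abstract does not specify them, and this is not presented as the authors' exact procedure.

```python
import numpy as np

def cca_embeddings(counts, dim, transform=np.sqrt):
    """Hypothetical sketch: embeddings from a scaled, transformed count matrix.

    counts: (n_words, n_contexts) array of non-negative co-occurrence counts.
    dim:    number of embedding dimensions to keep.
    """
    counts = np.asarray(counts, dtype=float)

    # Count transformation (assumed here to be an element-wise square root).
    transformed = transform(counts)

    # CCA-style scaling: divide each entry by the square roots of its
    # row total and column total (guarding against empty rows/columns).
    word_tot = np.maximum(transformed.sum(axis=1), 1e-12)
    ctx_tot = np.maximum(transformed.sum(axis=0), 1e-12)
    scaled = transformed / np.sqrt(word_tot)[:, None] / np.sqrt(ctx_tot)[None, :]

    # Low-rank decomposition: the top left singular vectors give one
    # row per word.
    u, _, _ = np.linalg.svd(scaled, full_matrices=False)
    emb = u[:, :dim]

    # Normalize rows to unit length so dot products are cosine similarities.
    norms = np.maximum(np.linalg.norm(emb, axis=1, keepdims=True), 1e-12)
    return emb / norms

# Toy counts (made up for illustration): 4 words, 3 context types.
toy_counts = np.array([
    [10, 0, 3],
    [8, 1, 2],
    [0, 9, 7],
    [1, 7, 9],
])
toy_embeddings = cca_embeddings(toy_counts, dim=2)
```

Row-normalizing the result means that `toy_embeddings @ toy_embeddings.T` gives cosine similarities between words, which is the usual quantity compared in word similarity evaluations.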

Cite

Stratos, K., Collins, M., & Hsu, D. (2015). Model-based word embeddings from decompositions of count matrices. In ACL-IJCNLP 2015 - 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, Proceedings of the Conference (Vol. 1, pp. 1282–1291). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/p15-1124
