Topic models incorporating statistical word senses

Abstract

LDA treats a surface word as identical across all documents and measures that word's contribution to each topic. However, a surface word may carry different meanings in different contexts, i.e., polysemous words can be used with different senses. Intuitively, disambiguating word senses for topic models can enhance their discriminative power. In this work, we propose a joint model that induces document topics and word senses simultaneously. Instead of relying on pre-defined word sense resources, we capture word sense information via a latent variable and induce the senses directly from the corpora in a fully unsupervised manner. Experimental results show that the proposed joint model significantly outperforms classic LDA and a standalone sense-based LDA model in document clustering. © 2014 Springer-Verlag Berlin Heidelberg.
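The abstract describes a generative process in which a topic first emits a latent sense, and the sense then emits the surface word. The following is a minimal illustrative sketch of that idea, not the paper's actual model or parameters: the topic names, sense labels, and probabilities are invented for illustration. It shows how one polysemous surface word ("bank") can be generated from two distinct latent senses belonging to different topics.

```python
import random

random.seed(0)

# Hypothetical toy parameters (NOT the paper's estimates):
# each topic is a distribution over latent senses, and each sense is a
# distribution over surface words. The surface word "bank" is shared by
# two senses, which is what sense induction would need to separate.
topic_sense = {
    "finance": {"bank_money": 0.8, "loan": 0.2},
    "nature":  {"bank_river": 0.7, "tree": 0.3},
}
sense_word = {
    "bank_money": {"bank": 0.9, "deposit": 0.1},
    "bank_river": {"bank": 0.6, "shore": 0.4},
    "loan":       {"loan": 1.0},
    "tree":       {"tree": 1.0},
}

def draw(dist):
    """Sample a key from a {key: probability} distribution."""
    r, acc = random.random(), 0.0
    for k, p in dist.items():
        acc += p
        if r < acc:
            return k
    return k  # guard against floating-point round-off

def generate_word(topic):
    """One generative step: topic -> latent sense -> surface word."""
    sense = draw(topic_sense[topic])
    word = draw(sense_word[sense])
    return sense, word

# Generate a tiny "document" under the finance topic.
doc = [generate_word("finance") for _ in range(5)]
```

In the full model, the topic itself would be drawn per word from a document-specific topic distribution (as in LDA), and both the topic-sense and sense-word distributions would be inferred from the corpus rather than fixed.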

Citation (APA)

Tang, G., Xia, Y., Sun, J., Zhang, M., & Zheng, T. F. (2014). Topic models incorporating statistical word senses. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8403 LNCS, pp. 151–162). Springer Verlag. https://doi.org/10.1007/978-3-642-54906-9_13
