Short text classification based on LDA topic model

Abstract

With the rapid development of computer technology and network communication, the volume of short text data has increased enormously. Classifying short text snippets is a great challenge because they carry little semantic information and are highly sparse. In this paper, we propose an improved short text classification method based on the Latent Dirichlet Allocation (LDA) topic model and the K-Nearest Neighbor (KNN) algorithm. The generated probabilistic topics both make the texts more semantically focused and reduce their sparseness. In addition, we present a novel topic similarity measure that uses the topic-word matrix and the relationship between the discriminative terms of two short texts. A short text dataset for experimental validation is constructed by crawling posts from the Sina News website. Extensive comparative experiments demonstrate the effectiveness of the proposed method.
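
The general idea of combining LDA topic features with a KNN classifier can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify their topic similarity measure, so plain cosine similarity over document-topic distributions is swapped in here. The toy corpus, the function names (fit_lda, cosine, knn_predict), the hyperparameters, and the choice to fit LDA jointly on the labeled texts and the query text are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns one topic distribution per document."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]             # topic counts per document
    topic_word = [[0] * n_words for _ in range(n_topics)]  # word counts per topic
    topic_total = [0] * n_topics                           # total tokens per topic
    assignments = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)
    # Resample each token's topic conditioned on all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][wid] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][wid] += 1
                topic_total[z] += 1
    # Smoothed document-topic distributions; each row sums to 1.
    return [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]


def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_predict(query, train_vecs, train_labels, k=3):
    """Majority vote among the k training vectors most similar to the query."""
    ranked = sorted(
        zip(train_vecs, train_labels),
        key=lambda pair: cosine(query, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus of tokenized short texts (not the Sina News data).
train_docs = [
    "team wins match final goal".split(),
    "player scores goal match".split(),
    "coach team season match".split(),
    "stock market shares fall".split(),
    "bank raises interest rate market".split(),
    "shares rise stock bank".split(),
]
train_labels = ["sports", "sports", "sports", "finance", "finance", "finance"]
test_doc = "team goal season".split()

# Simplification: LDA is fitted on the labeled texts plus the query text,
# so no separate inference step for unseen documents is needed.
theta = fit_lda(train_docs + [test_doc], n_topics=2)
predicted = knn_predict(theta[-1], theta[:-1], train_labels, k=3)
print(predicted)
```

Representing each text by its topic distribution replaces a sparse bag-of-words vector with a short dense one, which is the sparseness reduction the abstract refers to. On a corpus this small the sampled topics are unstable, so the predicted label should be read as a demonstration of the data flow rather than as a meaningful result.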

Cite

Chen, Q., Yao, L., & Yang, J. (2017). Short text classification based on LDA topic model. In ICALIP 2016 - 2016 International Conference on Audio, Language and Image Processing - Proceedings (pp. 749–753). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICALIP.2016.7846525
