Short text classification based on semantics

Abstract

Data sparseness and unseen words are two major problems in short text classification. Under these conditions, the vector space model (VSM), which represents text by the statistical occurrence of its terms, is unsuitable for direct use. To address these problems, we present a novel short text classification method based on semantics, carried out using K-Means. In the experiments, we use continuous word embeddings trained on very large unrelated corpora to represent semantic relationships. Experimental results on an open dataset show that using semantics greatly improves short text classification performance compared with a state-of-the-art VSM baseline, and that the proposed method can reduce the cost of collecting training data.
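The abstract does not say how K-Means and the word embeddings are combined, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes each short text is represented by the average of its word vectors, the training texts are clustered with K-Means, each cluster takes the majority label of its members, and a new text receives the label of its nearest centroid. The embedding table `EMB`, the training texts, and all function names are hypothetical toy values chosen for illustration; the paper uses embeddings pretrained on large corpora.

```python
import math
from collections import Counter

# Hypothetical 2-dimensional toy embeddings (assumption, for illustration only).
EMB = {
    "goal": [0.9, 0.1], "match": [0.8, 0.2], "team": [0.85, 0.15],
    "striker": [0.95, 0.05],
    "stock": [0.1, 0.9], "market": [0.2, 0.8], "shares": [0.15, 0.85],
    "investor": [0.05, 0.95],
}
DIM = 2


def embed(text):
    """Average the vectors of the words found in EMB; skip unknown words."""
    vecs = [EMB[w] for w in text.lower().split() if w in EMB]
    if not vecs:
        return [0.0] * DIM
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda c: dist(point, centroids[c]))


def kmeans(points, k, iters=20):
    """Plain K-Means with a deterministic start: the first k points."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        assign = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, [nearest(p, centroids) for p in points]


# Hypothetical labeled short texts.
train_texts = ["goal match", "stock market", "team goal", "shares market"]
train_labels = ["sports", "finance", "sports", "finance"]

points = [embed(t) for t in train_texts]
centroids, assign = kmeans(points, k=2)

# Label each cluster by majority vote over its training members.
cluster_label = {}
for c in range(len(centroids)):
    votes = Counter(l for l, a in zip(train_labels, assign) if a == c)
    if votes:
        cluster_label[c] = votes.most_common(1)[0][0]


def classify(text):
    """Assign the label of the nearest labeled centroid."""
    labeled = sorted(cluster_label)
    vec = embed(text)
    best = min(labeled, key=lambda c: dist(vec, centroids[c]))
    return cluster_label[best]


print(classify("striker"))
print(classify("investor shares"))
```

In this toy setup, "striker" and "investor" never appear in the training texts, yet both can still be placed near a cluster because they have embeddings. A term-count representation would give such unseen words no overlap with the training data, which is the problem the abstract describes.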

Cite

Ma, C., Wan, X., Zhang, Z., Li, T., & Zhang, Y. (2015). Short text classification based on semantics. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9227, pp. 463–470). Springer Verlag. https://doi.org/10.1007/978-3-319-22053-6_49
