Short text classification based on semantics

Chenglong Ma; Xin Wan; Zhen Zhang; Taisong Li; Yan Zhang

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2015) 9227 463-470

DOI: 10.1007/978-3-319-22053-6_49

2 Citations

6 Readers

Abstract

Data sparseness and unseen words are two major problems in short text classification. In such cases, it is unsuitable to represent the text directly with the vector space model (VSM), which relies on the statistical occurrence of terms. To solve these problems, we present a novel short text classification method based on semantics. The method is carried out using K-Means. In the experiments, we exploit continuous word embeddings trained on very large unrelated corpora to represent the semantic relationships. The experimental results on an open dataset show that the application of semantics greatly improves short text classification performance compared with a state-of-the-art VSM baseline, and that the proposed method can reduce the cost of collecting training data.
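The abstract does not spell out how K-Means and the word embeddings are combined, so the following is only one plausible reading, not the authors' implementation: cluster the word embeddings with K-Means, describe each short text by a histogram over the resulting word clusters, and classify with a nearest-prototype rule. The 2-D embedding values, the function names (`kmeans`, `featurize`, `train`, `predict`), and the nearest-prototype classifier are all assumptions made for illustration.

```python
import math
from collections import Counter

# Toy 2-D "embeddings" with made-up values (assumption); real embeddings
# would be trained on a large external corpus and have many more dimensions.
EMBEDDINGS = {
    "goal": (0.9, 0.1), "stock": (0.1, 0.9),
    "match": (0.8, 0.2), "market": (0.2, 0.8),
    "striker": (0.95, 0.15), "shares": (0.15, 0.95),
}

def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

def kmeans(points, k, iters=20):
    """Plain Lloyd's K-Means, initialised deterministically with the first k points."""
    centroids = [points[i] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its group; keep it if the group is empty.
        centroids = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids

def featurize(text, word_cluster, k):
    """Normalised histogram of word-cluster ids for the known words in text."""
    counts = Counter(word_cluster[w] for w in text.lower().split() if w in word_cluster)
    total = sum(counts.values()) or 1
    return [counts[c] / total for c in range(k)]

def train(labeled_texts, word_cluster, k):
    """Average the feature vectors of each class into one prototype per label."""
    sums, n = {}, Counter()
    for text, label in labeled_texts:
        acc = sums.setdefault(label, [0.0] * k)
        for i, v in enumerate(featurize(text, word_cluster, k)):
            acc[i] += v
        n[label] += 1
    return {label: [v / n[label] for v in acc] for label, acc in sums.items()}

def predict(text, prototypes, word_cluster, k):
    """Label of the class prototype closest to the text's feature vector."""
    vec = featurize(text, word_cluster, k)
    return min(prototypes, key=lambda label: math.dist(vec, prototypes[label]))

K = 2
words = list(EMBEDDINGS)
centroids = kmeans([EMBEDDINGS[w] for w in words], K)
word_cluster = {w: nearest(EMBEDDINGS[w], centroids) for w in words}

# "striker" and "shares" never occur in the labelled training texts.
prototypes = train([("goal match", "sports"), ("stock market", "finance")], word_cluster, K)
print(predict("striker", prototypes, word_cluster, K))
print(predict("shares", prototypes, word_cluster, K))
```

In this toy setup, words absent from the labelled data still fall into a cluster learned from the embeddings, which is how a cluster-level representation could soften the unseen-word and sparseness problems that a term-count VSM faces.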

Cite

Ma, C., Wan, X., Zhang, Z., Li, T., & Zhang, Y. (2015). Short text classification based on semantics. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9227, pp. 463–470). Springer Verlag. https://doi.org/10.1007/978-3-319-22053-6_49
