A novel semi-supervised short text classification algorithm based on fusion similarity

Abstract

We present a novel semi-supervised classification algorithm for short text based on fusion similarity, motivated by an analysis of the shortcomings of existing short text classification algorithms. First, words that indicate category membership are extracted from the labeled dataset to construct a strong category feature set, and a fusion similarity measure is designed by combining cosine similarity with a similarity based on the strong category features. Second, the mean of the supervised (labeled) information is computed to determine the virtual class center point of each class, and the real class center point is then found. Finally, for each real class center, we search the unlabeled dataset for the text with the highest similarity to it and assign that text the same class label as the center. The newly labeled text is added to the labeled collection, and the strong category feature set and the similarity matrix are updated. This process repeats until all short texts have been labeled. Experiments show that our method can significantly improve the efficiency of short text classification.
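
The loop described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not give the formulas, so several pieces here are assumptions. In particular, the rule for picking strong category features (words seen in only one class), the fusion formula (a weighted sum with a hypothetical weight `alpha` of cosine similarity and Jaccard overlap on strong features), the choice of the real center as the labeled text closest to the mean vector, and labeling a single text per iteration are all simplifications chosen for brevity. All function names are hypothetical, and at least one labeled text is assumed.

```python
import math
from collections import Counter, defaultdict

def cosine(a, b):
    """Cosine similarity between two bag-of-words mappings (word -> weight)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def strong_features(labeled):
    """Assumed rule: a word is 'strong' for a class if it occurs in no other class."""
    seen = defaultdict(set)  # word -> set of classes it appears in
    for vec, label in labeled:
        for w in vec:
            seen[w].add(label)
    strong = defaultdict(set)  # class -> set of strong words
    for w, classes in seen.items():
        if len(classes) == 1:
            strong[next(iter(classes))].add(w)
    return strong

def fusion_similarity(a, b, strong_words, alpha=0.5):
    """Assumed fusion: alpha * cosine + (1 - alpha) * Jaccard overlap on strong words."""
    sa, sb = set(a) & strong_words, set(b) & strong_words
    union = sa | sb
    overlap = len(sa & sb) / len(union) if union else 0.0
    return alpha * cosine(a, b) + (1 - alpha) * overlap

def real_center(members, strong_words, alpha):
    """Mean vector is the virtual center; the closest member is taken as the real center."""
    total = Counter()
    for m in members:
        total.update(m)
    virtual = {w: c / len(members) for w, c in total.items()}
    return max(members, key=lambda m: fusion_similarity(virtual, m, strong_words, alpha))

def self_train(labeled, unlabeled, alpha=0.5):
    """Label every unlabeled text, one per iteration (a simplification)."""
    labeled = [(Counter(text.split()), y) for text, y in labeled]
    pool = [(text, Counter(text.split())) for text in unlabeled]
    result = {}
    while pool:
        strong = strong_features(labeled)  # recomputed after every new label
        best = None  # (similarity, index in pool, label)
        for label in sorted({y for _, y in labeled}):
            members = [vec for vec, y in labeled if y == label]
            center = real_center(members, strong[label], alpha)
            for i, (_, vec) in enumerate(pool):
                s = fusion_similarity(center, vec, strong[label], alpha)
                if best is None or s > best[0]:
                    best = (s, i, label)
        _, i, label = best
        text, vec = pool.pop(i)
        result[text] = label
        labeled.append((vec, label))  # grow the labeled collection
    return result

if __name__ == "__main__":
    seed = [("football match goal", "sports"), ("stock market price", "finance")]
    print(self_train(seed, ["goal in the match", "market price falls"]))
```

Recomputing the strong features inside the loop mirrors the update step in the abstract; a practical version would likely cache the similarity matrix rather than recompute every pair on each pass.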

Cite

Li, X., Yan, L., Qin, N., & Ran, H. (2017). A novel semi-supervised short text classification algorithm based on fusion similarity. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10363 LNAI, pp. 309–319). Springer Verlag. https://doi.org/10.1007/978-3-319-63315-2_27
