A novel semi-supervised short text classification algorithm based on fusion similarity

Xiaohong Li; Li Yan; Na Qin; Hongyan Ran

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2017) 10363 LNAI 309-319

DOI: 10.1007/978-3-319-63315-2_27

7 Citations

7 Readers

Abstract

This paper presents a novel semi-supervised classification algorithm for short text based on fusion similarity, motivated by an analysis of the shortcomings of existing short text classification algorithms. First, words that strongly indicate a category are extracted from the labeled dataset to construct a strong category feature set, and a fusion similarity measure is designed by combining cosine similarity with a similarity based on the strong category features. Second, the mean of the supervised information is computed to determine a virtual class center point for each class, from which the real class center point is then found. Finally, the unlabeled text with the highest similarity to each real class center is located and assigned the same class label as that center. The newly labeled text is added to the labeled collection, and the strong category feature set and the similarity matrix are updated. This process repeats until all short texts have been labeled. Experiments show that the method can significantly improve the efficiency of short text classification.
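The loop described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the actual formulas, so several parts here are assumptions rather than the authors' method: a word is treated as a strong category feature if it occurs in exactly one class, the fusion similarity is a weighted sum (hypothetical weight `alpha`) of cosine similarity and the Jaccard overlap of strong features, and the real class center is taken to be the labeled text closest to the mean vector of its class. All function names are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def strong_features(labeled):
    """Assumed criterion: a word is 'strong' if it occurs in exactly one class."""
    classes_of = {}
    for vec, label in labeled:
        for term in vec:
            classes_of.setdefault(term, set()).add(label)
    return {t for t, cs in classes_of.items() if len(cs) == 1}


def fusion_similarity(a, b, strong, alpha=0.5):
    """Assumed fusion: weighted sum of cosine and strong-feature Jaccard overlap."""
    sa, sb = set(a) & strong, set(b) & strong
    union = sa | sb
    overlap = len(sa & sb) / len(union) if union else 0.0
    return alpha * cosine(a, b) + (1 - alpha) * overlap


def real_centers(labeled, strong):
    """Virtual center = class mean vector; real center = labeled text nearest to it."""
    centers = {}
    for label in {l for _, l in labeled}:
        members = [v for v, l in labeled if l == label]
        total = Counter()
        for v in members:
            total.update(v)
        virtual = Counter({t: c / len(members) for t, c in total.items()})
        centers[label] = max(
            members, key=lambda v: fusion_similarity(v, virtual, strong)
        )
    return centers


def self_train(labeled_texts, unlabeled_texts):
    """Label unlabeled texts one per class per round until none remain."""
    labeled = [(Counter(t.split()), l) for t, l in labeled_texts]
    pending = {t: Counter(t.split()) for t in unlabeled_texts}
    result = {}
    while pending:
        # Features and centers are refreshed once per round in this sketch.
        strong = strong_features(labeled)
        centers = real_centers(labeled, strong)
        for label in sorted(centers):
            if not pending:
                break
            center = centers[label]
            best = max(
                pending,
                key=lambda t: fusion_similarity(pending[t], center, strong),
            )
            result[best] = label
            labeled.append((pending.pop(best), label))
    return result


if __name__ == "__main__":
    seed = [
        ("cheap flight ticket", "travel"),
        ("hotel booking flight", "travel"),
        ("football match score", "sport"),
        ("tennis match final", "sport"),
    ]
    print(self_train(seed, ["flight hotel deal", "football final score"]))
```

In this sketch every class claims its most similar unlabeled text in each round, however low that similarity is. A practical variant might add a similarity threshold, but the abstract does not mention one.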

Cite

Li, X., Yan, L., Qin, N., & Ran, H. (2017). A novel semi-supervised short text classification algorithm based on fusion similarity. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10363 LNAI, pp. 309–319). Springer Verlag. https://doi.org/10.1007/978-3-319-63315-2_27
