A text clustering algorithm to detect basic level categories in texts

Jingyun Xu; Yi Cai; Shuai Wang; Kai Yang; Qing Du; Jun Zhang; Li Yao; Jingjing Li

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2017) 10473 LNCS 72-81

DOI: 10.1007/978-3-319-66733-1_8

0 Citations

10 Readers

Abstract

With the rapid development of the Internet and the explosion of texts, an appropriate way to organize the growing volume of texts is necessary. Text clustering is of great practical importance for web-learning, as it can group similar texts (e.g. documents, textbooks and online notes) to provide users with more valuable information. However, most existing text clustering algorithms are very sensitive to parameters that users must supply, and it is hard to set an appropriate parameter because computers do not know what an appropriate parameter is. To address this problem, and based on studies in cognitive psychology and our own observations, this paper first introduces basic level categories and category utility, and then proposes a non-parametric text clustering algorithm that detects basic level categories in texts automatically. The experimental results show that our algorithm significantly outperforms one basic level concept detection method, k-means and single linkage clustering on different datasets.
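The abstract names category utility but does not define it. As a rough illustration only, the sketch below scores a partition of documents with a commonly used formulation from conceptual clustering: for each cluster, the sum of squared term probabilities inside the cluster minus the same sum over the whole collection, weighted by cluster size and averaged over clusters. It is not taken from the paper and may differ from the authors' definition. The function name `category_utility`, the representation of documents as sets of terms, and the use of term presence only are all assumptions made for this example.

```python
from collections import Counter


def category_utility(clusters):
    """Score a partition of documents (illustrative, not the paper's method).

    clusters: list of non-empty clusters; each cluster is a list of
    documents; each document is a set of terms (assumed representation).
    """
    docs = [doc for cluster in clusters for doc in cluster]
    n = len(docs)
    vocab = set().union(*docs)

    # Baseline: sum over terms of P(term present)^2 in the whole collection.
    base_counts = Counter(term for doc in docs for term in doc)
    base_score = sum((base_counts[t] / n) ** 2 for t in vocab)

    total = 0.0
    for cluster in clusters:
        # Same sum, with probabilities estimated inside this cluster only.
        counts = Counter(term for doc in cluster for term in doc)
        within_score = sum((counts[t] / len(cluster)) ** 2 for t in vocab)
        # Weight the gain in predictability by the cluster's share of documents.
        total += (len(cluster) / n) * (within_score - base_score)

    # Average over clusters so that adding clusters is not rewarded by itself.
    return total / len(clusters)


if __name__ == "__main__":
    pets = [{"cat", "pet"}, {"cat", "pet"}]
    cars = [{"car", "road"}, {"car", "road"}]
    print(category_utility([pets, cars]))                      # topic-pure partition
    print(category_utility([[pets[0], cars[0]], [pets[1], cars[1]]]))  # mixed partition
```

In this toy data the topic-pure partition scores higher than the mixed one. Under this formulation, a parameter-free procedure could in principle compare candidate partitions by score instead of asking the user for a cluster count.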

Cite

Xu, J., Cai, Y., Wang, S., Yang, K., Du, Q., Zhang, J., … Li, J. (2017). A text clustering algorithm to detect basic level categories in texts. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10473 LNCS, pp. 72–81). Springer Verlag. https://doi.org/10.1007/978-3-319-66733-1_8
