The accuracy of fuzzy C-means in lower-dimensional space for topic detection

Hendri Murfi

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018) 11344 LNCS 321-334

DOI: 10.1007/978-3-030-05755-8_32

Abstract

Topic detection is an automatic method for discovering topics in textual data. The standard methods for topic detection are nonnegative matrix factorization (NMF) and latent Dirichlet allocation (LDA). An alternative is a clustering approach such as k-means or fuzzy c-means (FCM). FCM extends k-means in the sense that each text may have more than one topic. However, FCM works well for low-dimensional textual data and fails for high-dimensional textual data. One approach to overcome this problem is to transform the textual data into a lower-dimensional space, i.e., an eigenspace; the resulting method is called eigenspace-based FCM (EFCM). First, the textual data are transformed into an eigenspace using truncated singular value decomposition. FCM is then performed on the eigenspace data to identify the memberships of the textual data in clusters. Using these memberships, we generate topics from the high-dimensional textual data in the original space. In this paper, we examine the accuracy of EFCM for topic detection. Our simulations show that EFCM achieves accuracies between those of LDA and NMF in terms of both topic interpretation and topic recall.
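The three steps named in the abstract (truncated SVD, FCM in the eigenspace, topic generation in the original space) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the fuzzifier m = 2, the random initialization, the toy document-term matrix, and in particular the choice to form each topic as a membership-weighted mean of the original document vectors are all assumptions, since the abstract does not specify them.

```python
import numpy as np

def truncated_svd(X, p):
    """Return document coordinates in the p-dimensional eigenspace."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :p] * s[:p]

def fuzzy_c_means(Z, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard FCM on the rows of Z; returns an (n_docs x c) membership matrix."""
    rng = np.random.default_rng(seed)
    U = rng.random((Z.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)            # each row sums to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ Z) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)                 # guard against zero distance
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U

def efcm_topics(X, c, p, m=2.0, top_n=3):
    """Cluster in the eigenspace, then build topics in the original term space."""
    Z = truncated_svd(X, p)                      # step 1: lower-dimensional data
    U = fuzzy_c_means(Z, c, m=m)                 # step 2: fuzzy memberships
    W = U ** m
    # Step 3 (assumption): topic = membership-weighted mean of original vectors.
    topics = (W.T @ X) / W.sum(axis=0)[:, None]
    top_terms = np.argsort(-topics, axis=1)[:, :top_n]
    return U, topics, top_terms

# Toy document-term matrix (hypothetical): documents 0-2 use terms 0-2,
# documents 3-5 use terms 3-5.
X = np.array([
    [3, 2, 1, 0, 0, 0],
    [2, 3, 1, 0, 0, 0],
    [3, 3, 2, 0, 0, 0],
    [0, 0, 0, 2, 3, 1],
    [0, 0, 0, 3, 2, 2],
    [0, 0, 0, 1, 3, 3],
], dtype=float)

U, topics, top_terms = efcm_topics(X, c=2, p=2)
labels = U.argmax(axis=1)                        # hardened memberships
print(np.round(U, 3))
print(top_terms)
```

Each row of U holds one document's degrees of membership across the clusters, which is what lets a text carry more than one topic. Each row of top_terms lists the indices of the highest-weighted terms for one topic.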

Cite

APA

Murfi, H. (2018). The accuracy of fuzzy C-means in lower-dimensional space for topic detection. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11344 LNCS, pp. 321–334). Springer Verlag. https://doi.org/10.1007/978-3-030-05755-8_32
