The accuracy of fuzzy C-means in lower-dimensional space for topic detection

7Citations
Citations of this article
14Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Topic detection is an automatic method to discover topics in textual data. The standard methods of the topic detection are nonnegative matrix factorization (NMF) and latent Dirichlet allocation (LDA). Another alternative method is a clustering approach such as a k-means and fuzzy c-means (FCM). FCM extend the k-means method in the sense that the textual data may have more than one topic. However, FCM works well for low-dimensional textual data and fails for high-dimensional textual data. An approach to overcome the problem is transforming the textual data into lower dimensional space, i.e., Eigenspace, and called Eigenspace-based FCM (EFCM). Firstly, the textual data are transformed into an Eigenspace using truncated singular value decomposition. FCM is performed on the eigenspace data to identify the memberships of the textual data in clusters. Using these memberships, we generate topics from the high dimensional textual data in the original space. In this paper, we examine the accuracy of EFCM for topic detection. Our simulations show that EFCM results in the accuracies between the accuracies of LDA and NMF regarding both topic interpretation and topic recall.

Cite

CITATION STYLE

APA

Murfi, H. (2018). The accuracy of fuzzy C-means in lower-dimensional space for topic detection. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11344 LNCS, pp. 321–334). Springer Verlag. https://doi.org/10.1007/978-3-030-05755-8_32

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free