Document clustering based on a weighted exponential measurement

Abstract

The frequent term set clustering method was proposed to overcome two difficulties in document clustering: high dimensionality and finding meaningful labels for clusters. Although this method provides meaningful cluster labels, its accuracy is low. In this research, candidate clusters are extracted by mining frequent term sets from the document dataset. Each document is then assigned to these clusters according to their support values. A new similarity measurement function, based on the similarity and weight of clusters, is proposed to remove unwanted clusters in a noise reduction step. The proposed method thus relies on three concepts: term sets, support values, and the weight of each cluster. Experimental results on the "Re0" and "Hitech" datasets show that the proposed method produces more accurate clusters than previous approaches. © 2014 Springer-Verlag Berlin Heidelberg.
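
To make the first two steps of the abstract concrete, the following is a minimal sketch of frequent term set mining and support-based document assignment. It is not the authors' implementation. The function names `frequent_term_sets` and `assign_documents`, the `min_support` and `max_size` parameters, the brute-force enumeration (used here in place of an Apriori-style miner), and the rule "assign each document to the covering term set with the highest support" are all assumptions made for illustration. The weighted exponential measurement and the noise reduction step are not sketched, because the abstract does not give their formulas.

```python
from itertools import combinations


def frequent_term_sets(docs, min_support, max_size=2):
    """Map each frequent term set to the indices of the documents it covers.

    docs is a list of sets of terms. A term set is frequent when the fraction
    of documents containing all of its terms is at least min_support.
    Brute-force enumeration, so only suitable for small vocabularies.
    """
    n = len(docs)
    if n == 0:
        return {}
    vocab = sorted(set().union(*docs))
    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(vocab, size):
            cover = {i for i, doc in enumerate(docs) if doc.issuperset(combo)}
            if len(cover) / n >= min_support:
                result[frozenset(combo)] = cover
    return result


def assign_documents(docs, term_sets):
    """Assign each document to one candidate cluster (assumed rule).

    Among the term sets covering a document, pick the one with the highest
    support; break ties by larger term set, then by alphabetical order.
    Documents covered by no frequent term set are left unassigned.
    """
    assignment = {}
    for i in range(len(docs)):
        covering = [ts for ts, cover in term_sets.items() if i in cover]
        if covering:
            assignment[i] = min(
                covering,
                key=lambda ts: (-len(term_sets[ts]), -len(ts), tuple(sorted(ts))),
            )
    return assignment


if __name__ == "__main__":
    # Toy corpus: each document is reduced to its set of terms.
    docs = [
        {"cluster", "document", "term"},
        {"cluster", "document"},
        {"term", "support"},
        {"cluster", "support"},
    ]
    candidates = frequent_term_sets(docs, min_support=0.5)
    for doc_id, label in assign_documents(docs, candidates).items():
        print(doc_id, sorted(label))
```

Each frequent term set doubles as a readable cluster label, which is the property the abstract credits to this family of methods. A cluster-scoring and pruning step would follow the assignment.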

Cite

Taheri, S., Sim, A. T. H., & Ghorashi, S. H. (2014). Document clustering based on a weighted exponential measurement. In Lecture Notes in Electrical Engineering (Vol. 279 LNEE, pp. 65–70). Springer Verlag. https://doi.org/10.1007/978-3-642-41674-3_10
