Text dimensionality reduction for document clustering using hybrid memetic feature selection


Abstract

This paper proposes a document clustering method built on a hybrid feature selection method. The hybrid method, named Memetic Algorithm-Feature Selection (MA-FS), integrates a genetic-based wrapper method with a ranking filter. MA-FS is combined with the K-means and Spherical K-means (SK-means) clustering methods to perform document clustering. For comparison, another unsupervised feature selection method, Feature Selection Genetic Text Clustering (FSGATC), is used. The comparisons used two real-world criminal report document sets along with two popular benchmark datasets, Reuters and 20newsgroup. F-Micro, F-Macro and Average Distance of Document to Cluster (ADDC) measures were used for evaluation. The test results showed that MA-FS outperformed both FSGATC and clustering on the entire feature space (ALL).
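
To make the ADDC measure concrete, here is a minimal sketch of how a candidate feature subset might be scored by cluster compactness. The abstract does not give the formula, so this assumes a common form of ADDC (the per-cluster mean distance of documents to their centroid, averaged over clusters) and assumes cosine distance. The names `addc`, `apply_mask` and `cosine_distance` are hypothetical and are not taken from the paper.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an all-zero vector as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def apply_mask(doc, mask):
    """Keep only the term weights whose mask bit is set."""
    return [value for value, keep in zip(doc, mask) if keep]


def addc(docs, labels, mask):
    """Assumed ADDC: mean over clusters of the mean document-to-centroid distance."""
    clusters = {}
    for doc, label in zip(docs, labels):
        clusters.setdefault(label, []).append(apply_mask(doc, mask))
    total = 0.0
    for members in clusters.values():
        # Centroid is the per-feature mean of the cluster's documents.
        centroid = [sum(column) / len(members) for column in zip(*members)]
        total += sum(cosine_distance(d, centroid) for d in members) / len(members)
    return total / len(clusters)


# Toy term-weight vectors: the third feature is shared by every document.
docs = [[1, 0, 5], [2, 0, 5], [0, 1, 5], [0, 2, 5]]
labels = [0, 0, 1, 1]
all_features = addc(docs, labels, [1, 1, 1])
selected = addc(docs, labels, [1, 1, 0])
```

Under these assumptions a lower score means more compact clusters, so a wrapper search could use a score like this to compare feature masks. On the toy data, dropping the shared third feature gives a lower score than keeping all features.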

Cite

Al-Jadir, I., Wong, K. W., Fung, C. C., & Xie, H. (2017). Text dimensionality reduction for document clustering using hybrid memetic feature selection. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10607 LNAI, pp. 281–289). Springer Verlag. https://doi.org/10.1007/978-3-319-69456-6_23
