An efficient feature selection approach for clustering: Using a Gaussian mixture model of data dissimilarity

Abstract

Rapid advances in computer and database technologies have enabled organizations to accumulate vast amounts of data. The sheer volume of this data complicates the analysis task. Feature selection is an effective dimensionality reduction technique that removes irrelevant, redundant, or noisy features. This research proposes a novel feature selection measure for evaluating feature importance in clustering. The measure extracts information from the dissimilarity between pairs of data objects, since dissimilarity is a common criterion for deciding whether two objects belong in the same cluster. The probability distribution of the dissimilarity variable is modeled as a mixture of two Gaussian distributions, one for "intra-cluster" dissimilarity and one for "inter-cluster" dissimilarity. The means of the two Gaussian distributions are inferred with the EM algorithm, and the difference between the two means is used as the measure for selecting features that are important for clustering. The effectiveness of the proposed feature selection measure is demonstrated in a set of experiments. © Springer-Verlag Berlin Heidelberg 2007.
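
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how dissimilarity is computed or how EM is initialized, so the sketch assumes that each feature is scored on its own, that dissimilarity is the absolute difference between two values, and that EM starts from the lower and upper halves of the sorted dissimilarities. The function names `pairwise_dissimilarity`, `fit_two_gaussians`, and `feature_score` are hypothetical.

```python
import math
from itertools import combinations


def pairwise_dissimilarity(values):
    """Assumed dissimilarity: absolute difference for every pair of values."""
    return [abs(a - b) for a, b in combinations(values, 2)]


def fit_two_gaussians(d, n_iter=50):
    """Fit a 1-D mixture of two Gaussians with EM; return (low_mean, high_mean)."""
    if len(d) < 2:
        raise ValueError("need at least two dissimilarity values")
    s = sorted(d)
    half = len(s) // 2
    # Assumed initialization: means of the lower and upper halves.
    mu = [sum(s[:half]) / half, sum(s[half:]) / (len(s) - half)]
    mean_all = sum(d) / len(d)
    var_all = sum((x - mean_all) ** 2 for x in d) / len(d) + 1e-9
    var = [var_all, var_all]
    w = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each dissimilarity.
        resp = []
        for x in d:
            p = [
                w[k] * math.exp(-((x - mu[k]) ** 2) / (2 * var[k]))
                / math.sqrt(2 * math.pi * var[k])
                for k in range(2)
            ]
            total = p[0] + p[1]
            if total == 0.0:  # both densities underflowed
                resp.append((0.5, 0.5))
            else:
                resp.append((p[0] / total, p[1] / total))
        # M-step: update weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            w[k] = nk / len(d)
            mu[k] = sum(r[k] * x for r, x in zip(resp, d)) / nk
            var[k] = max(
                sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, d)) / nk,
                1e-6,  # variance floor to avoid a collapsed component
            )
    return min(mu), max(mu)


def feature_score(values):
    """Score one feature by the gap between the two fitted component means."""
    low, high = fit_two_gaussians(pairwise_dissimilarity(values))
    return high - low
```

Under these assumptions, a feature whose values form two tight groups should score higher than a feature whose values are spread evenly over the same range, because its intra-cluster and inter-cluster dissimilarities are far apart. Features would then be ranked by `feature_score` and the highest-scoring ones kept.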

Citation

Tsai, C. Y., & Chiu, C. C. (2007). An efficient feature selection approach for clustering: Using a Gaussian mixture model of data dissimilarity. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4705 LNCS, pp. 1107–1118). Springer Verlag. https://doi.org/10.1007/978-3-540-74472-6_92
