A Gaussian Mixture Model to Detect Clusters Embedded in Feature Subspace

Ming Dong; Jing Hua; Yuanhong Li

Journal Article, Open Access

Communications in Information and Systems (2007) 7(4) 337-352

DOI: 10.4310/cis.2007.v7.n4.a2

Abstract

The goal of unsupervised learning, i.e., clustering, is to determine the intrinsic structure of unlabeled data. Feature selection for clustering improves the performance of grouping by removing irrelevant features. Typical feature selection algorithms select a common feature subset for all the clusters. Consequently, clusters embedded in different feature subspaces cannot be identified. In this paper, we introduce a probabilistic model based on a Gaussian mixture to solve this problem. In particular, the feature relevance for an individual cluster is treated as a probability, which is represented by localized feature saliency and estimated through the Expectation Maximization (EM) algorithm during the clustering process. In addition, the number of clusters is determined simultaneously by integrating a Minimum Message Length (MML) criterion. Experiments carried out on both synthetic and real-world datasets illustrate the performance of the proposed approach in finding clusters embedded in feature subspace.
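The abstract does not give the model's equations, so the following is only a minimal sketch of what a per-cluster ("localized") feature saliency could look like inside a Gaussian mixture. It assumes independent features, and it assumes that each feature of each cluster is drawn from a cluster-specific Gaussian with probability equal to its saliency and from a shared background Gaussian otherwise. All function and parameter names (`cluster_density`, `responsibilities`, `bg_mean`, `bg_var`) are hypothetical, and only the E-step for one data point is shown; the saliency update and the MML-based choice of the number of clusters described in the abstract are not implemented here.

```python
import math


def normal_pdf(x, mean, var):
    """Univariate Gaussian density with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def cluster_density(x, saliency, mean, var, bg_mean, bg_var):
    """Density of point x under one cluster (assumed form, not from the paper).

    For feature l, saliency[l] is the probability that the feature is
    relevant to this cluster. A relevant feature follows the cluster's own
    Gaussian; an irrelevant one follows a background Gaussian shared by
    all clusters.
    """
    density = 1.0
    for l, x_l in enumerate(x):
        relevant = saliency[l] * normal_pdf(x_l, mean[l], var[l])
        irrelevant = (1.0 - saliency[l]) * normal_pdf(x_l, bg_mean[l], bg_var[l])
        density *= relevant + irrelevant
    return density


def responsibilities(x, weights, saliencies, means, variances, bg_mean, bg_var):
    """E-step for a single point: posterior probability of each cluster."""
    joint = [
        w * cluster_density(x, s, m, v, bg_mean, bg_var)
        for w, s, m, v in zip(weights, saliencies, means, variances)
    ]
    total = sum(joint)
    return [j / total for j in joint]


# Two clusters living in different one-dimensional subspaces of a 2-D space:
# cluster 0 is defined only by feature 0, cluster 1 only by feature 1.
weights = [0.5, 0.5]
saliencies = [[1.0, 0.0], [0.0, 1.0]]
means = [[5.0, 0.0], [0.0, -5.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
bg_mean = [0.0, 0.0]
bg_var = [4.0, 4.0]

point = [5.0, 0.3]
resp = responsibilities(point, weights, saliencies, means, variances, bg_mean, bg_var)
```

In this toy setup the point sits at the center of cluster 0 along feature 0, so nearly all of its posterior mass goes to cluster 0 even though its value on feature 1 carries no information about that cluster. A single global feature subset could not express this, which is the situation the abstract describes.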

Cite

Dong, M., Hua, J., & Li, Y. (2007). A Gaussian Mixture Model to Detect Clusters Embedded in Feature Subspace. Communications in Information and Systems, 7(4), 337–352. https://doi.org/10.4310/cis.2007.v7.n4.a2
