Relevant gene selection using normalized cut clustering with maximal compression similarity measure

Abstract

Microarray-based cancer classification has drawn the attention of the research community in the last few years as a route to better clinical diagnosis. Microarray datasets are characterized by high dimensionality and small sample size. To avoid the curse of dimensionality, good feature selection methods are needed. Here, we propose a two-stage algorithm for finding a small subset of relevant genes responsible for classification in high-dimensional microarray datasets. In the first stage, the entire feature space is divided into k clusters using normalized cut; the similarity measure used for clustering is the maximal information compression index. The most informative gene is selected from each cluster using the t-statistic, creating a pool of non-redundant genes. In the second stage, a wrapper-based forward feature selection method is used to obtain a set of optimal genes for a given classifier. The proposed algorithm is tested on three well-known datasets from the Kent Ridge Biomedical Data Repository. Comparison with other state-of-the-art methods shows that the proposed algorithm achieves better classification accuracy with fewer features. © 2010 Springer-Verlag Berlin Heidelberg.
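The first stage can be illustrated with a minimal sketch. It assumes the usual definition of the maximal information compression index as the smaller eigenvalue of the 2x2 covariance matrix of two features (zero when they are linearly dependent), and it assumes a Welch-style two-class t-statistic; the abstract does not specify either form. The function names are hypothetical, and the cluster assignments are taken as given, since the normalized cut step itself is not shown.

```python
import math
from statistics import mean, pvariance, variance


def mici(x, y):
    """Maximal information compression index of two features.

    Assumed definition: the smaller eigenvalue of the 2x2 covariance
    matrix of x and y. It is 0 when x and y are linearly dependent.
    """
    mx, my = mean(x), mean(y)
    vx, vy = pvariance(x), pvariance(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    # Eigenvalues of [[vx, cov], [cov, vy]] are (tr +/- sqrt(tr^2 - 4*det)) / 2.
    tr = vx + vy
    det = vx * vy - cov * cov
    disc = max(tr * tr - 4.0 * det, 0.0)  # guard against rounding below zero
    return max(0.5 * (tr - math.sqrt(disc)), 0.0)


def t_statistic(values, labels):
    """Absolute Welch t-statistic of one gene for class labels 0 and 1."""
    a = [v for v, l in zip(values, labels) if l == 0]
    b = [v for v, l in zip(values, labels) if l == 1]
    diff = abs(mean(a) - mean(b))
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0.0:
        return float("inf") if diff > 0 else 0.0
    return diff / se


def select_representatives(genes, clusters, labels):
    """Pick the gene with the highest t-statistic from each cluster.

    genes: list of expression vectors (one per gene);
    clusters: cluster id per gene (assumed to come from normalized cut);
    labels: class label (0 or 1) per sample.
    """
    best = {}
    for idx, (g, c) in enumerate(zip(genes, clusters)):
        score = t_statistic(g, labels)
        if c not in best or score > best[c][0]:
            best[c] = (score, idx)
    return sorted(idx for _, idx in best.values())


# Toy data: 4 genes, 4 samples, 2 clusters.
labels = [0, 0, 1, 1]
genes = [
    [1.0, 1.2, 3.0, 3.1],
    [2.0, 2.5, 2.1, 2.4],
    [5.0, 4.0, 5.1, 4.2],
    [0.1, 0.2, 0.9, 1.0],
]
clusters = [0, 0, 1, 1]
selected = select_representatives(genes, clusters, labels)
```

Because the index is a dissimilarity (smaller means more redundant), a clustering step built on it would need to convert it into an affinity first; how the paper does this is not stated in the abstract. The pool returned by `select_representatives` would then feed the second-stage wrapper.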

Cite

Bala, R., Agrawal, R. K., & Sardana, M. (2010). Relevant gene selection using normalized cut clustering with maximal compression similarity measure. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6119 LNAI, pp. 81–88). https://doi.org/10.1007/978-3-642-13672-6_9
