On the identification of correlated differential features for supervised classification of high-dimensional data

Abstract

Many real problems in supervised classification involve high-dimensional feature data measured on individuals of known origin from two or more classes. When the dimension of the feature vector is very large relative to the number of individuals, constructing a discriminant rule (classifier) for assigning an unclassified individual to one of the known classes presents formidable challenges. One way to handle this high-dimensional problem is to identify highly relevant differential features for constructing a classifier. Here a new approach is considered in which a mixture model with random effects is first used to partition the features into clusters; the relevance of each feature variable for differentiating the classes is then formally tested and ranked using cluster-specific contrasts of mixed effects. Finally, a non-parametric clustering approach is adopted to identify networks of differential features that are highly correlated. The method is illustrated using a publicly available cancer research data set for the discovery of correlated biomarkers relevant to cancer diagnosis and prognosis.
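The general shape of the pipeline described in the abstract (score each feature for class differences, rank the features, then group the top-ranked ones that are highly correlated) can be illustrated with a deliberately simplified stand-in. The sketch below does not implement the authors' mixture model with random effects or their cluster-specific contrasts. It assumes a Welch t-statistic for ranking and a connected-components grouping on a Pearson correlation threshold, and the function names, the threshold of 0.8, and the toy data are all hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch two-sample t-statistic (assumed stand-in for the paper's contrasts)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / math.sqrt(sxx * syy)


def rank_features(data, labels):
    """Rank feature names by |t| between class 0 and class 1, largest first."""
    scores = {}
    for name, values in data.items():
        a = [v for v, c in zip(values, labels) if c == 0]
        b = [v for v, c in zip(values, labels) if c == 1]
        scores[name] = abs(welch_t(a, b))
    return sorted(scores, key=scores.get, reverse=True)


def correlated_groups(data, features, threshold=0.8):
    """Group features into connected components of the graph |r| >= threshold."""
    remaining = list(features)
    groups = []
    while remaining:
        stack = [remaining.pop(0)]
        group = []
        while stack:
            f = stack.pop()
            group.append(f)
            linked = [g for g in remaining
                      if abs(pearson(data[f], data[g])) >= threshold]
            for g in linked:
                remaining.remove(g)
            stack.extend(linked)
        groups.append(sorted(group))
    return groups


# Hypothetical toy data: 8 individuals, 2 classes, 3 features.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
data = {
    "f1": [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1],  # differs between classes
    "f2": [2.1, 2.3, 2.0, 2.2, 4.1, 4.3, 4.0, 4.2],  # f1 shifted by 1.1
    "f3": [5.0, 1.0, 4.0, 2.0, 2.0, 4.0, 1.0, 5.0],  # same mean in both classes
}

top = rank_features(data, labels)[:2]   # keep the two most differential features
groups = correlated_groups(data, top)   # group those that are highly correlated
print(top, groups)
```

In this toy example the two class-differentiating features, `f1` and `f2`, are selected and fall into a single correlated group, while `f3` is ranked last. With real high-dimensional data the ranking step would also need a multiple-testing adjustment, which this sketch omits.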

Cite

Ng, S. K., & McLachlan, G. J. (2017). On the identification of correlated differential features for supervised classification of high-dimensional data. In Studies in Classification, Data Analysis, and Knowledge Organization (Vol. 0, pp. 43–57). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-319-55723-6_4
