Nonparametric variable selection, clustering and prediction for large biological datasets

Abstract

The development of parsimonious models for reliable inference and prediction of responses in high-dimensional regression settings is often challenging due to relatively small sample sizes and the presence of complex interaction patterns between a large number of covariates. We propose an efficient, nonparametric framework for simultaneous variable selection, clustering and prediction in high-throughput regression settings with continuous outcomes. The proposed model utilizes the sparsity induced by Poisson-Dirichlet processes (PDPs) to group the covariates into lower-dimensional latent clusters consisting of covariates with similar patterns among the samples. The data are permitted to direct the choice of a suitable cluster allocation scheme, choosing between PDPs and their special case, a Dirichlet process. Subsequently, the latent clusters are used to build a nonlinear prediction model for the responses using an adaptive mixture of linear and nonlinear elements, thus achieving a balance between model parsimony and flexibility. Through analyses of gene expression microarray datasets we demonstrate the reliability of the proposed method's clustering mechanism and show that the technique compares favorably to, and often outperforms, existing methodologies in terms of the prediction accuracies of the subject-specific clinical outcomes.
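The clustering idea in the abstract can be illustrated with a small sketch. The code below is not the authors' model. It only samples a random partition from the two-parameter Chinese restaurant process, the standard sequential description of a Poisson-Dirichlet process, to show how setting the discount to zero recovers the Dirichlet process special case. The function name `sample_pdp_partition` and its parameters `discount` and `concentration` are hypothetical names chosen for illustration, and the covariates are represented only by their indices.

```python
import random


def sample_pdp_partition(n_items, discount, concentration, seed=0):
    """Sample cluster labels for n_items from a two-parameter
    Chinese restaurant process (illustrative sketch only).

    discount = 0 gives the Dirichlet process special case.
    """
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must lie in [0, 1)")
    if concentration <= -discount:
        raise ValueError("concentration must exceed -discount")

    rng = random.Random(seed)
    labels = []  # cluster index assigned to each item
    sizes = []   # current number of items in each cluster

    for _ in range(n_items):
        k = len(sizes)
        if k == 0:
            # The first item always opens the first cluster.
            sizes.append(1)
            labels.append(0)
            continue
        # Existing cluster j is chosen with weight (size_j - discount);
        # a new cluster is opened with weight (concentration + discount * k).
        weights = [s - discount for s in sizes]
        weights.append(concentration + discount * k)
        choice = rng.choices(range(k + 1), weights=weights)[0]
        if choice == k:
            sizes.append(1)
        else:
            sizes[choice] += 1
        labels.append(choice)

    return labels, sizes


# Hypothetical usage: 500 covariates under a PDP and under a DP.
pdp_labels, pdp_sizes = sample_pdp_partition(500, discount=0.5, concentration=1.0)
dp_labels, dp_sizes = sample_pdp_partition(500, discount=0.0, concentration=1.0)
print(len(pdp_sizes), "clusters with discount 0.5")
print(len(dp_sizes), "clusters with discount 0.0 (Dirichlet process)")
```

For a fixed concentration, a positive discount tends to produce more clusters and more small ones than the Dirichlet process, which is one reason a model might let the data choose between the two allocation schemes.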

Citation (APA)

Guha, S., Banerjee, S., Gu, C., & Baladandayuthapani, V. (2015). Nonparametric variable selection, clustering and prediction for large biological datasets. In Nonparametric Bayesian Inference in Biostatistics (pp. 175–192). Springer International Publishing. https://doi.org/10.1007/978-3-319-19518-6_9
