Meta-analysis based variable selection for gene expression data

Abstract

Summary: Recent advances in biotechnology and their wide application have led to the generation of many high-dimensional gene expression data sets that can be used to address similar biological questions. Meta-analysis plays an important role in summarizing and synthesizing scientific evidence from multiple studies. When the data sets are high-dimensional, it is desirable to incorporate variable selection into meta-analysis to improve model interpretation and prediction. To our knowledge, all existing methods conduct variable selection with meta-analyzed data in an "all-in-or-all-out" fashion, that is, a gene is either selected in all studies or not selected in any study. However, because of the heterogeneity that commonly exists in meta-analyzed data, including differences in the choice of biospecimens, study population, and measurement sensitivity, a gene may be important in some studies and unimportant in others. In this article, we propose a novel method called meta-lasso for variable selection with high-dimensional meta-analyzed data. Through a hierarchical decomposition of the regression coefficients, our method not only borrows strength across multiple data sets to boost the power to identify important genes, but also keeps the selection flexible across data sets to account for data heterogeneity. We show that our method possesses gene selection consistency, that is, when the sample size of each data set is large, our method can, with high probability, identify all important genes and remove all unimportant genes. Simulation studies demonstrate that our method performs well. We applied our meta-lasso method to a meta-analysis of five cardiovascular studies, and the results are clinically meaningful.
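
The abstract gives no formula for the hierarchical decomposition, so the following is only an illustrative sketch of one way such a decomposition can be written. All symbols here (M studies, p genes, coefficients \beta_{mj}, factors \gamma_j and \xi_{mj}, log-likelihoods \ell_m, tuning parameters \lambda_\gamma and \lambda_\xi) are assumptions introduced for illustration and are not taken from the abstract.

```latex
% Assumed notation: study m = 1,...,M; gene j = 1,...,p.
% Each study-specific coefficient is split into a gene-level factor
% shared by all studies and a study-specific factor:
\[
  \beta_{mj} = \gamma_j \, \xi_{mj}, \qquad \gamma_j \ge 0 .
\]
% A lasso-type penalty is placed on both layers of the decomposition:
\[
  \min_{\gamma,\,\xi}\;
  -\sum_{m=1}^{M} \ell_m\!\left(\beta_m\right)
  \;+\; \lambda_\gamma \sum_{j=1}^{p} \gamma_j
  \;+\; \lambda_\xi \sum_{j=1}^{p} \sum_{m=1}^{M} \left|\xi_{mj}\right| .
\]
```

Under this sketch, \gamma_j = 0 drops gene j from every study, which is how information is pooled across data sets, while \xi_{mj} = 0 with \gamma_j > 0 drops gene j from study m alone, which is what allows a gene to be selected in some studies and not in others.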

Cite

Li, Q., Wang, S., Huang, C. C., Yu, M., & Shao, J. (2014). Meta-analysis based variable selection for gene expression data. Biometrics, 70(4), 872–880. https://doi.org/10.1111/biom.12213
