Gene set bagging for estimating the probability a statistically significant result will replicate

7Citations
Citations of this article
38Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Background: Significance analysis plays a major role in identifying and ranking genes, transcription factor binding sites, DNA methylation regions, and other high-throughput features associated with illness. We propose a new approach, called gene set bagging, for measuring the probability that a gene set replicates in future studies. Gene set bagging involves resampling the original high-throughput data, performing gene-set analysis on the resampled data, and confirming that biological categories replicate in the bagged samples.Results: Using both simulated and publicly-available genomics data, we demonstrate that significant categories in a gene set enrichment analysis may be unstable when subjected to resampling. We show our method estimates the replication probability (R), the probability that a gene set will replicate as a significant result in future studies, and show in simulations that this method reflects replication better than each set's p-value.Conclusions: Our results suggest that gene lists based on p-values are not necessarily stable, and therefore additional steps like gene set bagging may improve biological inference on gene sets. © 2013 Jaffe et al.; licensee BioMed Central Ltd.

Cite

CITATION STYLE

APA

Jaffe, A. E., Storey, J. D., Ji, H., & Leek, J. T. (2013). Gene set bagging for estimating the probability a statistically significant result will replicate. BMC Bioinformatics, 14(1). https://doi.org/10.1186/1471-2105-14-360

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free