Cautions about the reliability of pairwise gene correlations based on expression data

5Citations
Citations of this article
17Readers
Mendeley users who have this article in their library.

Abstract

© 2015 Powers, DeJongh, Best and Tintle. Background: Rapid growth in the availability of genome-wide transcript abundance levels through gene expression microarrays and RNAseq promises to provide deep biological insights into the complex, genome-wide transcriptional behavior of single-celled organisms. However, this promise has not yet been fully realized. Results: We find that computation of pairwise gene associations (correlation; mutual information) across a set of 2782 total genome-wide expression samples from six diverse bacteria produces unexpectedly large variation in estimates of pairwise gene association-regardless of the metric used, the organism under study, or the number and source of the samples. We pinpoint the cause to sampling bias. In particular, in repositories of expression data (e.g., Gene Expression Omnibus, GEO), many individual genes show small differences in absolute gene expression levels across the set of samples. We demonstrate that these small differences are due mainly to "noise" instead of "signal" attributable to environmental or genetic perturbations. We show that downstream analysis using gene expression levels of genes with small differences yields biased estimates of pairwise association. Conclusions: We propose flagging genes with small differences in absolute, RMA-normalized, expression levels (e.g., standard deviation less than 0.5), as potentially yielding biased pairwise association metrics. This strategy has the potential to substantially improve the confidence in genome-wide conclusions about transcriptional behavior in bacterial organisms. Further work is needed to further refine strategies to identify genes with small difference in expression levels prior to computing gene-gene association metrics.

Cite

CITATION STYLE

APA

Powers, S., DeJongh, M., Best, A. A., & Tintle, N. L. (2015). Cautions about the reliability of pairwise gene correlations based on expression data. Frontiers in Microbiology, 6(JUN). https://doi.org/10.3389/fmicb.2015.00650

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free