A major goal of global gene expression profiling in plant seeds has been to investigate the parental contributions to the transcriptomes of early embryos and endosperm. However, consistency between independent studies has been poor, leading to considerable debate. We have developed a statistical tool that reveals the presence of substantial RNA contamination from maternal tissues in nearly all published Arabidopsis thaliana endosperm and early embryo transcriptomes generated in these studies. We demonstrate that maternal RNA contamination explains the poor reproducibility of these transcriptomic data sets. Furthermore, we found that RNA contamination from maternal tissues has been repeatedly misinterpreted as epigenetic phenomena, which has resulted in inaccurate conclusions regarding the parental contributions to both the endosperm and early embryo transcriptomes. After accounting for maternal RNA contamination, no published genome-wide data set supports the concept of delayed paternal genome activation in plant embryos. Moreover, our analysis suggests that maternal and paternal genomic imprinting are equally rare events in Arabidopsis endosperm. Our publicly available software (https://github.com/Gregor-Mendel-Institute/tissue-enrichment-test) can help the community assess the level of contamination in transcriptome data sets generated from both seed and non-seed tissues.
Schon, M. A., & Nodine, M. D. (2017, April 1). Widespread contamination of Arabidopsis embryo and endosperm transcriptome data sets. Plant Cell. American Society of Plant Biologists. https://doi.org/10.1105/tpc.16.00845