Precision annotation of digital samples in NCBI's gene expression omnibus

This article is free to access.

Abstract

The Gene Expression Omnibus (GEO) contains more than two million digital samples from functional genomics experiments amassed over almost two decades. However, individual sample meta-data remains poorly described by unstructured free-text attributes, which prevents its large-scale reanalysis. We introduce the Search Tag Analyze Resource for GEO as a web application (http://STARGEO.org) to curate better annotations of sample phenotypes uniformly across different studies, and to use these sample annotations to define robust genomic signatures of disease pathology by meta-analysis. In this paper, we target a small group of biomedical graduate students to show rapid crowd-curation of precise sample annotations across all phenotypes, and we demonstrate the biological validity of these crowd-curated annotations for breast cancer. STARGEO.org makes GEO data findable, accessible, interoperable and reusable (i.e., FAIR) to ultimately facilitate knowledge discovery. Our work demonstrates the utility of crowd-curation and interpretation of open 'big data' under FAIR principles as a first step towards realizing an ideal paradigm of precision medicine.

Cite

Hadley, D., Pan, J., El-Sayed, O., Aljabban, J., Aljabban, I., Azad, T. D., … Butte, A. J. (2017). Precision annotation of digital samples in NCBI’s gene expression omnibus. Scientific Data, 4. https://doi.org/10.1038/sdata.2017.125
