Background: MEDLINE®/PubMed® currently indexes over 18 million biomedical articles, providing unprecedented opportunities and challenges for text analysis. Using Medical Subject Heading Over-representation Profiles (MeSHOPs), an entity of interest can be robustly summarized, quantitatively identifying associated biomedical terms and predicting novel indirect associations.

Methods: A procedure is introduced for quantitative comparison of MeSHOPs derived from a group of MEDLINE® articles for a biomedical topic (for example, articles for a specific gene or disease). Similarity scores are computed to compare MeSHOPs of genes and diseases.

Results: Similarity scores successfully infer novel associations between diseases and genes. The number of papers addressing a gene or disease has a strong influence on predicted associations, revealing an important bias for gene-disease relationship prediction. Predictions derived from comparisons of MeSHOPs achieve a mean 8% AUC improvement in the identification of gene-disease relationships compared to gene-independent baseline properties.

Conclusions: MeSHOP comparisons are demonstrated to provide predictive capacity for novel relationships between genes and human diseases. We demonstrate the impact of literature bias on the performance of gene-disease prediction methods. MeSHOPs provide a rich source of annotation to facilitate relationship discovery in biomedical informatics.

© 2012 Cheung et al.; licensee BioMed Central Ltd.
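The abstract does not say which statistic or similarity measure the authors use, so the following is only a minimal sketch of the general idea, under stated assumptions: over-representation of each MeSH term is scored with a one-sided hypergeometric test, a profile maps each term to -log10(p), and two profiles are compared by cosine similarity. The function names (`hypergeom_tail`, `profile`, `cosine_similarity`) and all counts are hypothetical and invented purely for illustration; they are not taken from the paper.

```python
from math import comb, log10, sqrt


def hypergeom_tail(k, n, K, N):
    """P(X >= k): chance of seeing at least k term-bearing articles when
    drawing n articles from N, of which K carry the term."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def profile(term_counts, n_articles, background_counts, n_background):
    """Assumed profile form: MeSH term -> -log10(p) of over-representation."""
    scores = {}
    for term, k in term_counts.items():
        p = hypergeom_tail(k, n_articles, background_counts[term], n_background)
        scores[term] = -log10(max(p, 1e-300))  # guard against log10(0)
    return scores


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term -> score profiles."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy numbers (invented): a background of 1000 articles and the number
# of those articles annotated with each MeSH term.
N_BACKGROUND = 1000
BACKGROUND = {"Neoplasms": 100, "Apoptosis": 50, "Humans": 800}

# Term counts among 20 articles for a gene and 30 articles for a disease.
gene_profile = profile(
    {"Neoplasms": 10, "Apoptosis": 8, "Humans": 16}, 20, BACKGROUND, N_BACKGROUND
)
disease_profile = profile(
    {"Neoplasms": 25, "Apoptosis": 3, "Humans": 24}, 30, BACKGROUND, N_BACKGROUND
)

print(cosine_similarity(gene_profile, disease_profile))
```

In this sketch a higher score means the gene and the disease share strongly over-represented terms. Because p-values shrink as the number of articles grows, heavily studied entities tend to have larger profile scores, which is one way the literature bias mentioned in the Results could arise.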
Cheung, W. A., Ouellette, B. F. F., & Wasserman, W. W. (2012). Inferring novel gene-disease associations using Medical Subject Heading Over-representation Profiles. Genome Medicine, 4(9). https://doi.org/10.1186/gm376