Motivation: Large-scale phenotyping projects such as the Sanger Mouse Genetics Project are ongoing efforts to identify how genes and their modification influence phenotypes. Gene-phenotype relations are crucial for improving our understanding of human heritable diseases and for developing drugs. However, because higher vertebrate genomes contain ∼20 000 genes and experimentally verifying gene-phenotype relations is resource-intensive, methods are needed to identify good candidates for testing.

Results: In this study, we applied association rule mining to identify promising secondary phenotype candidates. The predictions rely on a large gene-phenotype annotation set, which is used to find occurrence patterns of phenotypes. With this approach, we identified 1967 secondary phenotype hypotheses covering 244 genes and 136 phenotypes. Using two automated evaluation strategies and one manual strategy, we demonstrate that the secondary phenotype candidates are biologically relevant to the genes for which they are predicted. We conclude that the predicted secondary phenotypes are good candidates for experimental testing and confirmation.

© 2014 The Author. Published by Oxford University Press. All rights reserved.
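As a rough illustration of the general idea, the sketch below mines pairwise association rules from a gene-to-phenotype annotation set and proposes, for each gene, phenotypes implied by a rule but not yet annotated. It is a minimal sketch only: the abstract does not specify the mining algorithm, thresholds, or data format, so the function names, the threshold values, and the toy genes and phenotypes are all hypothetical and do not reproduce the authors' pipeline.

```python
from collections import Counter
from itertools import combinations


def mine_pairwise_rules(annotations, min_support=0.4, min_confidence=0.6):
    """Return rules as (antecedent, consequent, support, confidence).

    annotations maps each gene to the set of phenotypes annotated to it.
    Thresholds are illustrative assumptions, not values from the paper.
    """
    n_genes = len(annotations)
    single = Counter()  # number of genes showing each phenotype
    pair = Counter()    # number of genes showing both phenotypes of a pair
    for phenotypes in annotations.values():
        single.update(phenotypes)
        pair.update(combinations(sorted(phenotypes), 2))

    rules = []
    for (a, b), count in pair.items():
        support = count / n_genes
        if support < min_support:
            continue
        # Each frequent pair yields two directed rules: a -> b and b -> a.
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = count / single[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return rules


def suggest_candidates(annotations, rules):
    """Propose, per gene, rule consequents that are not yet annotated."""
    candidates = {}
    for gene, phenotypes in annotations.items():
        proposed = {
            consequent
            for antecedent, consequent, _, _ in rules
            if antecedent in phenotypes and consequent not in phenotypes
        }
        if proposed:
            candidates[gene] = proposed
    return candidates


# Hypothetical toy annotation set (not data from the study).
toy_annotations = {
    "geneA": {"p1", "p2"},
    "geneB": {"p1", "p2"},
    "geneC": {"p1", "p2", "p3"},
    "geneD": {"p1"},
    "geneE": {"p3"},
}

rules = mine_pairwise_rules(toy_annotations)
candidates = suggest_candidates(toy_annotations, rules)
print(candidates)
```

In this toy set, phenotypes p1 and p2 co-occur in three of five genes, so the rule p1 -> p2 passes both thresholds and geneD, which shows p1 but not p2, receives p2 as a candidate secondary phenotype.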
Oellrich, A., Jacobsen, J., Papatheodorou, I., & Smedley, D. (2014). Using association rule mining to determine promising secondary phenotyping hypotheses. Bioinformatics, 30(12). https://doi.org/10.1093/bioinformatics/btu260