A random forest approach to capture genetic effects in the presence of population structure

Abstract

Accurate mapping of causal variants in genome-wide association studies requires accounting for both confounding factors (for example, population structure) and nonlinear interactions between individual genetic variants. Here, we propose a method, termed mixed random forest, that simultaneously accounts for population structure and captures nonlinear genetic effects. We test the model in simulation experiments and show that the mixed random forest approach improves detection power compared with established approaches. In an application to data from an outbred mouse population, we find that mixed random forest identifies associations that are more consistent with prior knowledge than those identified by competing methods. Furthermore, our approach predicts phenotypes from genotypes with greater accuracy than any of the other methods that we tested. Our results show that approaches that simultaneously account for both confounding due to population structure and epistatic interactions are important to fully explain the heritable component of complex quantitative traits.
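
To make the confounding problem named in the abstract concrete, the following is a minimal simulation sketch. It is not the mixed random forest method: it swaps in a plain linear regression with a known subpopulation label as covariate, purely to illustrate why population structure can produce spurious associations. The function names (`simulate`, `slope_t_stat`), the allele frequencies, and the effect sizes are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate(n_per_pop=500, seed=0):
    """Simulate a non-causal SNP whose frequency differs between two subpopulations.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    pop = np.repeat([0, 1], n_per_pop)           # subpopulation label per individual
    freq = np.where(pop == 0, 0.2, 0.8)          # allele frequency depends on subpopulation
    snp = rng.binomial(2, freq).astype(float)    # genotype coded as 0, 1 or 2 copies
    # The phenotype depends on subpopulation only; the SNP has no causal effect.
    pheno = 1.0 * pop + rng.normal(0.0, 1.0, size=pop.size)
    return pop, snp, pheno


def slope_t_stat(x, y, covariates=None):
    """t statistic for x in an ordinary least squares fit of y on [1, covariates, x]."""
    n = len(y)
    cols = [np.ones(n)]
    if covariates is not None:
        cols.append(covariates)
    cols.append(x)
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = n - X.shape[1]
    sigma2 = resid @ resid / dof                  # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)         # covariance of the coefficient estimates
    return beta[-1] / np.sqrt(cov[-1, -1])


pop, snp, pheno = simulate()
t_naive = slope_t_stat(snp, pheno)                      # ignores population structure
t_adjusted = slope_t_stat(snp, pheno, covariates=pop)   # conditions on subpopulation
```

In this toy setting the unadjusted test statistic is strongly inflated even though the SNP is not causal, whereas conditioning on subpopulation removes most of the signal. In real cohorts the structure is usually not a known binary label, which is one motivation for modeling it as a random effect as mixed-model approaches do.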

Cite

Stephan, J., Stegle, O., & Beyer, A. (2015). A random forest approach to capture genetic effects in the presence of population structure. Nature Communications, 6. https://doi.org/10.1038/ncomms8432
