Low-coverage whole-genome sequencing (WGS) is increasingly used for the study of evolution and ecology in both model and non-model organisms; however, effective application of low-coverage WGS data requires the implementation of probabilistic frameworks to account for the uncertainties in genotype likelihoods. Here, we present a probabilistic framework for using genotype likelihoods for standard population assignment applications. Additionally, we derive the Fisher information for allele frequency from genotype likelihoods and use that to describe a novel metric, the effective sample size, which figures heavily in assignment accuracy. We make these developments available for application through WGSassign, an open-source software package that is computationally efficient for working with whole-genome data. Using simulated and empirical data sets, we demonstrate the behaviour of our assignment method across a range of population structures, sample sizes and read depths. Through these results, we show that WGSassign can provide highly accurate assignment, even for samples with low average read depths (<0.01X) and among weakly differentiated populations. Our simulation results highlight the importance of equalizing the effective sample sizes among source populations in order to achieve accurate population assignment with low-coverage WGS data. We further provide study design recommendations for population assignment studies and discuss the broad utility of effective sample size for studies using low-coverage WGS data.
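The assignment idea summarized above can be illustrated with a minimal sketch. It assumes biallelic sites, Hardy-Weinberg genotype proportions within each source population, and independence across loci. It is not the WGSassign interface: the function names, the clamping constant `eps`, and the toy numbers are all hypothetical and chosen only for illustration. For each locus, the three genotype likelihoods are weighted by the genotype probabilities implied by a population's allele frequency; the logs of these per-locus values are summed, and the individual is assigned to the population with the highest total.

```python
import math

def hwe_genotype_probs(p):
    """P(g | p) for g = 0, 1, 2 copies of the counted allele under Hardy-Weinberg."""
    q = 1.0 - p
    return (q * q, 2.0 * p * q, p * p)

def assignment_log_likelihood(genotype_likelihoods, allele_freqs, eps=1e-6):
    """Log-likelihood of one individual given one population's allele frequencies.

    genotype_likelihoods: one (L0, L1, L2) tuple per locus, i.e. P(reads | g).
    allele_freqs: one frequency per locus for the same counted allele.
    eps (illustrative assumption): keeps frequencies away from 0 and 1 so that
    a single fixed site cannot drive the likelihood to zero.
    """
    total = 0.0
    for gl, p in zip(genotype_likelihoods, allele_freqs):
        p = min(max(p, eps), 1.0 - eps)
        probs = hwe_genotype_probs(p)
        # Marginalize over the unknown genotype at this locus.
        site = sum(l * pr for l, pr in zip(gl, probs))
        total += math.log(site)
    return total

def assign(genotype_likelihoods, pop_freqs):
    """Return the best-scoring population and the per-population log-likelihoods."""
    scores = {
        pop: assignment_log_likelihood(genotype_likelihoods, freqs)
        for pop, freqs in pop_freqs.items()
    }
    best = max(scores, key=scores.get)
    return best, scores

# Toy data: three loci; the third has flat genotype likelihoods (no reads).
example_gls = [(0.1, 0.3, 0.6), (0.7, 0.2, 0.1), (1 / 3, 1 / 3, 1 / 3)]
example_pops = {"A": [0.9, 0.1, 0.5], "B": [0.2, 0.8, 0.5]}
best_pop, pop_scores = assign(example_gls, example_pops)
print(best_pop, pop_scores)
```

In this toy case the locus with flat genotype likelihoods adds the same constant to every population's score, so it does not affect the assignment. This is one way to see why a sample's information content at low read depth depends on more than the number of sites or individuals.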
DeSaix, M. G., Rodriguez, M. D., Ruegg, K. C., & Anderson, E. C. (2024). Population assignment from genotype likelihoods for low-coverage whole-genome sequencing data. Methods in Ecology and Evolution, 15(3), 493–510. https://doi.org/10.1111/2041-210X.14286