The distribution of genetic polymorphisms in a population contains information about evolutionary processes. The Poisson random field (PRF) model uses the polymorphism frequency spectrum to infer the mutation rate and the strength of directional selection. The PRF model relies on an infinite-sites approximation that is reasonable for most eukaryotic populations but becomes problematic when θ is large (θ ≳ 0.05). Here, we show that at the large mutation rates characteristic of microbes and viruses, the infinite-sites approximation of the PRF model induces systematic biases that lead it to underestimate negative selection pressures and mutation rates and to erroneously infer positive selection. We introduce two new methods that extend our ability to infer selection pressures and mutation rates at large θ: a finite-site modification of the PRF model and a new technique based on diffusion theory. Our methods can be used to infer not only a "weighted average" of selection pressures acting on a gene sequence, but also the distribution of selection pressures across sites. We evaluate the accuracy of our methods, as well as that of the original PRF approach, by comparison with Wright-Fisher simulations. Copyright © 2008 by the Genetics Society of America.
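To make the phrase "polymorphism frequency spectrum" concrete, here is a minimal sketch of the expected spectrum under the original infinite-sites PRF model. It assumes the standard Sawyer-Hartl form of the frequency density, θ(1 − e^(−2γ(1−x))) / ((1 − e^(−2γ)) x(1−x)), with γ the scaled selection coefficient. That formula and its scaling convention are not stated in the abstract, and the function name `expected_sfs` and the `grid` parameter are hypothetical. It does not implement the finite-site or diffusion-based methods the abstract introduces.

```python
import math


def expected_sfs(theta, gamma, n, grid=20000):
    """Expected number of sites with i = 1..n-1 derived alleles in a sample of n.

    Assumes the infinite-sites PRF density in the Sawyer-Hartl form
    (an assumption of this sketch, not taken from the abstract).
    Integration over allele frequency x uses a simple midpoint rule.
    """
    dx = 1.0 / grid
    neutral = abs(gamma) < 1e-12
    # Normalising constant 1 - exp(-2*gamma), computed stably with expm1.
    denom = 1.0 if neutral else -math.expm1(-2.0 * gamma)

    spectrum = []
    for i in range(1, n):
        binom = math.comb(n, i)
        total = 0.0
        for k in range(grid):
            x = (k + 0.5) * dx  # midpoint, so x is never exactly 0 or 1
            if neutral:
                density = theta / x  # gamma -> 0 limit of the density
            else:
                numer = -math.expm1(-2.0 * gamma * (1.0 - x))
                density = theta * numer / (denom * x * (1.0 - x))
            # Binomial probability of seeing i derived alleles at frequency x.
            total += density * binom * x**i * (1.0 - x) ** (n - i) * dx
        spectrum.append(total)
    return spectrum


neutral_sfs = expected_sfs(theta=1.0, gamma=0.0, n=10)
negative_sfs = expected_sfs(theta=1.0, gamma=-5.0, n=10)
```

Under neutrality the sketch recovers the classical θ/i spectrum, and negative γ shifts weight toward rare variants. Per the abstract, this infinite-sites form is the one that becomes biased when θ is large.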
Desai, M. M., & Plotkin, J. B. (2008). The polymorphism frequency spectrum of finitely many sites under selection. Genetics, 180(4), 2175–2191. https://doi.org/10.1534/genetics.108.087361