This article describes a new Markov chain Monte Carlo (MCMC) method applicable to DNA sequence data, which treats mutations in the genealogy as missing data. The method facilitates inferences regarding the age and identity of specific mutations while taking the full complexities of the mutational process in DNA sequences into account. We demonstrate the utility of the method in three applications. First, we demonstrate how the method can be used to make inferences regarding population genetic parameters such as θ (the effective population size times the mutation rate). Second, we show how the method can be used to estimate the ages of mutations in finite sites models and to make inferences regarding the distribution and ages of nonsynonymous and synonymous mutations. The method is applied to two previously published data sets, and we demonstrate that in one of the data sets the average age of nonsynonymous mutations is significantly lower than the average age of synonymous mutations, suggesting the presence of slightly deleterious mutations. Third, we demonstrate how the method can be used more generally to evaluate the posterior distribution of a function of a mapping of mutations on a gene genealogy. This application is useful for evaluating the uncertainty associated with methods that rely on mapping mutations on a phylogeny or a gene genealogy.
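The third application can be illustrated with a minimal sketch. It assumes that an MCMC run has already produced a list of sampled mutation mappings, each one a list of (age, kind) pairs with kind "N" for nonsynonymous and "S" for synonymous. That data layout, the function names, and the toy numbers are hypothetical and are not taken from the article. The sketch only shows the generic Monte Carlo step of evaluating a function on each sampled mapping and summarizing the resulting values. It does not implement the sampler itself.

```python
from statistics import mean

def age_difference(mapping):
    """Mean nonsynonymous age minus mean synonymous age for one sampled mapping.

    `mapping` is a hypothetical structure: a list of (age, kind) tuples,
    where kind is "N" (nonsynonymous) or "S" (synonymous).
    Returns None if either class is absent from the mapping.
    """
    nonsyn = [age for age, kind in mapping if kind == "N"]
    syn = [age for age, kind in mapping if kind == "S"]
    if not nonsyn or not syn:
        return None
    return mean(nonsyn) - mean(syn)

def summarize(samples):
    """Summarize the function's values over the sampled mappings.

    Returns the average difference and the fraction of samples in which
    nonsynonymous mutations are on average younger than synonymous ones.
    """
    diffs = [d for d in map(age_difference, samples) if d is not None]
    post_mean = mean(diffs)
    prob_younger = sum(d < 0 for d in diffs) / len(diffs)
    return post_mean, prob_younger

# Toy stand-in for MCMC output (made-up values, for illustration only).
samples = [
    [(0.10, "N"), (0.40, "S"), (0.30, "S")],
    [(0.20, "N"), (0.50, "S")],
    [(0.60, "N"), (0.30, "S"), (0.05, "N")],
]

post_mean, prob_younger = summarize(samples)
print(post_mean, prob_younger)
```

In a real analysis the samples would come from the sampler after burn-in, and any function of the mapping could be substituted for `age_difference`.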
CITATION STYLE
Nielsen, R. (2001). Mutations as missing data: Inferences on the ages and distributions of nonsynonymous and synonymous mutations. Genetics, 159(1), 401–411. https://doi.org/10.1093/genetics/159.1.401