
Bayes or bootstrap? A simulation study comparing the performance of Bayesian Markov chain Monte Carlo sampling and bootstrapping in assessing phylogenetic confidence.

by Michael E Alfaro, Stefan Zoller, François Lutzoni
Molecular Biology and Evolution 20(2):255–266 (2003)

Abstract

Bayesian Markov chain Monte Carlo sampling has become increasingly popular in phylogenetics as a method both for estimating the maximum likelihood topology and for assessing nodal confidence. Despite the growing use of posterior probabilities, the relationship between the Bayesian measure of confidence and the most commonly used confidence measure in phylogenetics, the nonparametric bootstrap proportion, is poorly understood. We used computer simulation to investigate the behavior of three phylogenetic confidence methods: Bayesian posterior probabilities calculated via Markov chain Monte Carlo sampling (BMCMC-PP), maximum likelihood bootstrap proportion (ML-BP), and maximum parsimony bootstrap proportion (MP-BP). We simulated the evolution of DNA sequences on 17-taxon topologies under 18 evolutionary scenarios and examined the performance of these methods in assigning confidence to correct monophyletic and incorrect monophyletic groups, and we examined the effects of increasing character number on support values. BMCMC-PP and ML-BP were often strongly correlated with one another but could provide substantially different estimates of support on short internodes. In contrast, BMCMC-PP correlated poorly with MP-BP across most of the simulation conditions that we examined. For a given threshold value, more correct monophyletic groups were supported by BMCMC-PP than by either ML-BP or MP-BP. When threshold values were chosen that fixed the rate of accepting incorrect monophyletic relationships as true at 5%, all three methods recovered most of the correct relationships on the simulated topologies, although BMCMC-PP and ML-BP performed better than MP-BP. BMCMC-PP was usually a less biased predictor of phylogenetic accuracy than either bootstrapping method. BMCMC-PP provided high support values for correct topological bipartitions with fewer characters than were needed for the nonparametric bootstrap.


Available from mbe.oupjournals.org

Bayes or Bootstrap? A Simulation Study Comparing the Performance of Bayesian Markov Chain Monte Carlo Sampling and Bootstrapping in Assessing Phylogenetic Confidence

Michael E. Alfaro,* Stefan Zoller,† and François Lutzoni†
*Evolution and Ecology, University of California, Davis; and †Department of Biology, Duke University

Introduction

Confidence measures play an important role in phylogenetics, especially when trees serve as the conceptual framework for the study of trait evolution. These measures allow workers to identify trees or parts of a tree that are well supported by the data and thus adequate to serve as the basis for evolutionary inference of biological systems (Huelsenbeck, Rannala, and Masly 2000; Lutzoni et al. 2001; Pagel and Lutzoni 2002). Arguably the most commonly used confidence method in phylogenetics has been nonparametric bootstrapping, a statistical technique invented by Efron (1979) and first applied to the phylogeny problem by Felsenstein (1985). Phylogenetic nonparametric bootstrapping involves the random resampling (with replacement) of characters from the original data to generate pseudoreplicate data matrices identical in size to the original matrix. These pseudoreplicates are then subjected to the same phylogenetic searches as the original data set. Bootstrap support for a group of interest is calculated as the proportion of times that the group is obtained in the pseudoreplicates. The rationale for the resampling of the original matrix is that the distribution of the pseudoreplicates around the observed data is a valid approximation of the distribution of observed data sets on the true, unknown process that generates the data sets (Efron 1979; Efron, Halloran, and Holmes 1996).
In phylogenetic terms, this suggests that a monophyletic group that receives a high bootstrap proportion would be expected to be recovered by other analyses of new data sets that were generated by the same underlying process (Felsenstein 1985), and it is for this reason that the bootstrap is sometimes described as a measure of repeatability (Berry and Gascuel 1996; Felsenstein 1985; Hillis and Bull 1993).

Hillis and Bull (1993) examined the performance of nonparametric bootstrapping as a measure of phylogenetic accuracy, that is, the probability that a given monophyletic group appears on the true tree. Their finding, that bootstrap proportions greater than 50% underestimated phylogenetic accuracy, sparked a flurry of papers that sought to clarify the interpretation of bootstrap "P" values (Felsenstein and Kishino 1993; Li and Zharkikh 1994; Sanderson 1995; Zharkikh and Li 1995; Berry and Gascuel 1996; Newton 1996). An important point that emerged from this scrutiny was that phylogenetic accuracy, sensu Hillis and Bull (1993), is not a quantity that bootstrapping typically tests and, furthermore, that bootstrapping may overestimate or underestimate phylogenetic accuracy depending on the condition under which the data were generated (e.g., Efron, Halloran, and Holmes 1996; Felsenstein and Kishino 1993). In addition, type I error, which is the quantity that many workers desire the bootstrap to reflect, is only approximated by the conventional bootstrap procedure (Efron, Halloran, and Holmes 1996). A more complex, two-step bootstrapping procedure is necessary to transform bootstrap proportions into standard frequentist confidence intervals. Despite these studies, conventional nonparametric bootstrapping is still widely viewed as providing a measure of phylogenetic accuracy (e.g., Murphy et al. 2001).
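
The resampling procedure described earlier in the Introduction (draw characters with replacement, run the same search on each pseudoreplicate, and count how often a group reappears) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `resample_columns`, `bootstrap_proportions`, and `closest_pair` are hypothetical, and `closest_pair` is only a toy stand-in for a real maximum likelihood or maximum parsimony search.

```python
import random
from typing import Callable, Dict, FrozenSet, Set

Alignment = Dict[str, str]   # taxon name -> aligned sequence (equal lengths)
Clade = FrozenSet[str]       # a monophyletic group, as a set of taxon names


def resample_columns(alignment: Alignment, rng: random.Random) -> Alignment:
    """Draw sites with replacement to build a pseudoreplicate of the same size."""
    n_sites = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[i] for i in picks) for taxon, seq in alignment.items()}


def bootstrap_proportions(
    alignment: Alignment,
    infer_clades: Callable[[Alignment], Set[Clade]],
    n_reps: int = 100,
    seed: int = 0,
) -> Dict[Clade, float]:
    """Proportion of pseudoreplicates in which each clade is recovered."""
    rng = random.Random(seed)
    counts: Dict[Clade, int] = {}
    for _ in range(n_reps):
        pseudo = resample_columns(alignment, rng)
        # Apply the same search to every pseudoreplicate as to the original data.
        for clade in infer_clades(pseudo):
            counts[clade] = counts.get(clade, 0) + 1
    return {clade: hits / n_reps for clade, hits in counts.items()}


def closest_pair(alignment: Alignment) -> Set[Clade]:
    """Hypothetical toy search: group the two taxa with the fewest differences."""
    taxa = sorted(alignment)
    pairs = [(a, b) for i, a in enumerate(taxa) for b in taxa[i + 1:]]
    best = min(
        pairs,
        key=lambda p: sum(x != y for x, y in zip(alignment[p[0]], alignment[p[1]])),
    )
    return {frozenset(best)}
```

In practice `infer_clades` would wrap a full tree search and return every bipartition of the resulting tree; the search is passed in as a function so that the resampling logic stays independent of the optimality criterion.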
Thus, nonparametric bootstrapping has been used to measure three quantities (Berry and Gascuel 1996): repeatability, the probability of observing a given result in future repeated sampling of the same underlying character distribution; accuracy (Hillis and Bull 1993), the probability that a given monophyletic group is present on the true tree; and type I error rate (Felsenstein and Kishino 1993), assuming a null model of nonmonophyly. The theoretical justification for interpreting nonparametric bootstrap values as measures of repeatability is quite strong (Efron and Tibshirani 1993, 1998; Efron, Halloran, and Holmes 1996), and most of the debate over the bootstrap has focused on whether and how the bootstrap proportion can be meaningfully related to phylogenetic accuracy and frequentist testing (e.g., Sanderson 1995; Berry and Gascuel 1996). Threatening to add to the confusion over the interpretation of bootstrap values in phylogenetics is the increasingly widespread use of Bayesian methods to calculate the Bayesian confidence limits (posterior probabilities) for monophyletic relationships.

Key words: Bayesian Markov chain Monte Carlo, bootstrap, maximum parsimony, maximum likelihood, posterior probability, phylogenetic confidence, simulation.
E-mail: malfaro@ucdavis.edu.
Mol. Biol. Evol. 20(2):255–266. 2003
DOI: 10.1093/molbev/msg028
© 2003 by the Society for Molecular Biology and Evolution. ISSN: 0737-4038

Bayesian Confidence Methods

Bayesian inference in phylogenetics has become increasingly common since its development in the late 1990s (see reviews in Huelsenbeck et al. 2001; Lewis 2001). Broadly speaking, in Bayesian inference one makes use of Bayes's theorem to condition inferences about the value of some parameter of interest on the observed data. Bayesian inference focuses on the quantity known as the posterior probability, defined as the probability of some hypothesis conditional on the observed data. The posterior probability is proportional to the product of the likelihood of the data, given that the hypothesis is correct, and the prior probability of the hypothesis before any data have been collected.

In Bayesian phylogenetics, parameters such as the tree topology, branch lengths, and substitution parameters are modeled as probability distributions. Using Bayes's theorem, the posterior probability of any one of these parameters may be expressed as a marginal distribution, integrating over those remaining.
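
Written out, the relationships described above take roughly the following form. The notation is an assumption introduced here for illustration and does not come from the paper: H is a hypothesis, D the observed data, X the sequence alignment, \tau the tree topology, \nu the branch lengths, and \theta the substitution parameters.

```latex
% Bayes's theorem: posterior is proportional to likelihood times prior
P(H \mid D) = \frac{P(D \mid H)\,P(H)}{P(D)} \;\propto\; P(D \mid H)\,P(H)

% Marginal posterior of a topology, integrating over the remaining parameters
P(\tau \mid X) =
  \frac{\int\!\!\int P(X \mid \tau, \nu, \theta)\, P(\tau, \nu, \theta)\, d\nu\, d\theta}
       {\sum_{\tau'} \int\!\!\int P(X \mid \tau', \nu, \theta)\, P(\tau', \nu, \theta)\, d\nu\, d\theta}
```

The denominator sums over every candidate topology \tau' and integrates over all branch lengths and substitution parameters, which is why the number of terms grows so quickly with the number of taxa.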
Solving analytically for the posterior probability requires the integration of the likelihood function over all possible values of the remaining parameters, which is effectively intractable for even moderately complex problems. Modern Bayesian methods use Markov chain Monte Carlo methods to approximate this integration by simulating draws from the joint posterior distribution of all model parameters. Posterior probabilities for the parameters of interest are calculated using the Markov chain samples. For example, the posterior probability of a tree or bipartition in a tree is determined simply by examining the proportion of all of the Markov-chain samples that contain the topological bipartition of interest.

The Meaning and Measure of Confidence Values

Despite the growing popularity of Bayesian methods in phylogenetics (see Lewis 2001), there is no current consensus on how posterior probabilities should be interpreted relative to more traditional support measures such as the bootstrap. Efron, Halloran, and Holmes (1996) pointed out that bootstrap values correspond closely to posterior probabilities calculated under a multinomial model of site pattern frequency, and some workers have also implied that posterior probabilities derived from standard likelihood models of sequence evolution (i.e., those calculated by programs such as MrBayes [Huelsenbeck 2000] and BAMBE [Larget and Simon 1999]) are also closely equivalent to likelihood bootstrap proportions (e.g., Larget and Simon 1999). Others have noted that posterior probabilities are often much higher than the associated bootstrap proportion and have cited this as evidence that the Bayesian posterior probabilities do not suffer from the conservative bias that has been attributed to bootstrap values with regard to phylogenetic accuracy (Murphy et al. 2001).
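
The bipartition calculation described under Bayesian Confidence Methods, where the posterior probability is the proportion of Markov chain samples containing the group, can be illustrated with a short sketch. It assumes each sampled tree has already been reduced to a set of clades; the function name `clade_posterior` and the `burn_in` parameter are illustrative choices made here, not details taken from the paper or from any particular program.

```python
from typing import FrozenSet, Sequence, Set

Clade = FrozenSet[str]   # a group of taxa, e.g. frozenset({"A", "B"})


def clade_posterior(
    sampled_trees: Sequence[Set[Clade]],
    clade: Clade,
    burn_in: int = 0,
) -> float:
    """Estimate a clade's posterior probability from Markov chain samples.

    sampled_trees: one set of clades per sampled tree, in chain order.
    burn_in: number of initial samples to discard (an assumption of this sketch).
    """
    kept = sampled_trees[burn_in:]
    if not kept:
        raise ValueError("no samples left after discarding burn-in")
    # Proportion of retained samples whose tree contains the clade of interest.
    hits = sum(1 for tree in kept if clade in tree)
    return hits / len(kept)


# Usage with four hypothetical sampled trees on taxa A-D.
AB = frozenset({"A", "B"})
AC = frozenset({"A", "C"})
ABC = frozenset({"A", "B", "C"})
ABD = frozenset({"A", "B", "D"})
samples = [{AB, ABC}, {AB, ABD}, {AC, ABC}, {AB, ABC}]
print(clade_posterior(samples, AB))   # 3 of 4 samples contain {A, B}: 0.75
```

The counting step has the same form as a bootstrap proportion. What differs is what is being counted: draws from the joint posterior distribution here, and searches on resampled data matrices for the bootstrap.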
The purpose of the current study is to investigate the comparative behavior of nonparametric bootstrapping and Bayesian Markov chain Monte Carlo (BMCMC) methods in assigning confidence to phylogenetic results. Simulations are powerful tools for evaluating the performance of phylogenetics methods because the true tree and generating model are known a priori (e.g., Hillis, Allard, and Miyamoto 1993). For this study, we chose to explore the performance of these methods under evolutionary scenarios that were designed to approximate a single gene study (1,000 base pairs) of a moderate number of taxa (17) rather than a simple four-taxon case in order to obtain a better understanding of how these methods perform under conditions more typical of real data sets.

We quantified performance of support methods on a range of these topologies to address several fundamental questions about Bayesian posterior probabilities and bootstrap proportions. First, we compare how bootstrap and BMCMC assign confidence to the same correct internodes on a tree to determine if they are essentially equivalent techniques. Second, we compare the width of confidence envelopes for these two kinds of confidence by adopting the traditional interpretation of the bootstrap as a measure of repeatability and the posterior probability as the probability that a monophyletic group is correct. Third, we investigate the performance of these methods in estimating phylogenetic accuracy and explore the consequence of constructing decision rules from support values on rates of type I error and on other performance benchmarks that we derive from our simulations. Finally, we compared the sensitivity ...

FIG. 1. Comparison between Bayesian and nonparametric bootstrap methods in assigning confidence to the same correct internodes on pectinate topologies. Shown are box plots, which indicate the 10%, 25%, median, 75%, and 90% interval boundaries of support for each of the 14 internodes on the indicated topology. Results from low-rate trees (0.08 expected changes per site as measured from the root of the tree to any tip) are in the first column, and results for high-rate trees (0.30 expected changes per site from root to tip of the tree) are in the second column. For each scenario, Bayesian Markov chain Monte Carlo posterior probabilities (BMCMC-PP) are shown in the top plot, followed by maximum likelihood bootstrap proportion values (ML-BP) and maximum parsimony bootstrap proportion values (MP-BP). Numbers in bold in lower right or lower middle of each graph indicate the median percentage of correct nodes (out of 14 possible) that received support 70%/95% over the 100 replicates.
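
The third question above concerns decision rules built from support values. One possible reading of such a rule is sketched below: choose the smallest support threshold at which no more than 5% of the incorrect groups from a simulation would be accepted, then report the fraction of correct groups still recovered at that threshold. This interpretation, the whole-percent threshold grid, and the function names are all assumptions made for illustration. The excerpt does not specify how the thresholds were actually computed.

```python
from typing import Optional, Sequence


def false_accept_rate(incorrect_support: Sequence[float], threshold: float) -> float:
    """Share of incorrect groups whose support (in percent) meets the threshold."""
    if not incorrect_support:
        raise ValueError("need at least one incorrect group")
    return sum(s >= threshold for s in incorrect_support) / len(incorrect_support)


def smallest_threshold(
    incorrect_support: Sequence[float], alpha: float = 0.05
) -> Optional[int]:
    """Lowest whole-percent threshold keeping the false-accept rate at or below alpha.

    Returns None if even a threshold of 100% accepts too many incorrect groups.
    """
    for threshold in range(0, 101):
        if false_accept_rate(incorrect_support, threshold) <= alpha:
            return threshold
    return None


def recovered_fraction(correct_support: Sequence[float], threshold: float) -> float:
    """Share of correct groups whose support (in percent) meets the threshold."""
    if not correct_support:
        raise ValueError("need at least one correct group")
    return sum(s >= threshold for s in correct_support) / len(correct_support)
```

Because the true tree is known in a simulation, every recovered group can be labeled correct or incorrect, which is what makes both rates computable.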
