More and more researchers in phylogenetics are concatenating gene sequences to produce supermatrices in the hope that larger data sets will lead to better phylogenetic resolution. Almost all of these supermatrices contain a high proportion of missing data, which could cause phylogenetic bias. Previous studies aiming to identify missing-data-mediated bias in the maximum likelihood method have noted a bias associated with among-site rate variation. However, this finding rests on sequence simulation and has been challenged by other simulation studies, and the controversy remains unresolved. Here I illustrate analytically the bias caused by missing data coupled with among-site rate variation. This approach allows one to see how much the bias can contribute to likelihood differences among different topologies. The study highlights the point that, while supermatrices may lead to "robust" trees, such "robust" trees may be purchased with illegal phylogenetic currency. © 2014 Springer International Publishing Switzerland.
CITATION STYLE
Xia, X. (2014). Phylogenetic bias in the likelihood method caused by missing data coupled with among-site rate variation: An analytical approach. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8492 LNBI, pp. 12–23). Springer Verlag. https://doi.org/10.1007/978-3-319-08171-7_2