When the process underlying DNA substitutions varies across evolutionary history, some standard Markov models underlying phylogenetic methods are mathematically inconsistent. The most prominent example is the general time-reversible model (GTR) together with some, but not all, of its submodels. To rectify this deficiency, nonhomogeneous Lie Markov models have been identified as the class of models that are consistent in the face of a changing process of DNA substitutions regardless of taxon sampling. Some well-known models in popular use are within this class, but are either overly simplistic (e.g., the Kimura two-parameter model) or overly complex (the general Markov model). On a diverse set of biological data sets, we test a hierarchy of Lie Markov models spanning the full range of parameter richness. Compared against the benchmark of the ever-popular GTR model, we find that as a whole the Lie Markov models perform well, with the best performing models having 8-10 parameters and the ability to recognize the distinction between purines and pyrimidines.
Woodhams, M. D., Fernández-Sánchez, J., & Sumner, J. G. (2015). A New Hierarchy of Phylogenetic Models Consistent with Heterogeneous Substitution Rates. Systematic Biology, 64(4), 638–650. https://doi.org/10.1093/sysbio/syv021