Background: The statistical distribution of the similarity or difference between pairs of paralogous genes, created by whole genome doubling, or between pairs of orthologous genes in two related species is an important source of information about genomic evolution, especially in plants.

Methods: We derive the mixture of distributions of sequence similarity for duplicate gene pairs generated by repeated episodes of whole genome doubling. This involves integrating sequence divergence and gene pair loss through fractionation, using a branching process and a mutational model. We account not only for the timing of these events, as reflected in the local modes of the distribution, but also for the amplitude and variance of the component distributions. This model is then extended to orthologous gene pairs.

Results: We apply the model and inference procedures to the evolution of the Solanaceae, focusing on the genomes of economically important crops. We assess how consistent or variable fractionation rates are from species to species and over time.
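The interplay of doubling, fractionation and divergence described in the Methods can be illustrated with a toy simulation. The following is a minimal sketch under simplifying assumptions and not the model of the paper: every name and parameter here (`simulate_pair_counts`, `component_params`, `mixture`, `keep_both`, `rate`, `n_sites`) is hypothetical, fractionation is reduced to a single probability that both copies survive a doubling, and the mutational model assumes each site stays identical with probability exp(-rate × time), giving a binomial variance for each component.

```python
import math
import random
from collections import Counter
from itertools import combinations


def simulate_pair_counts(n_events, keep_both, n_genes, rng):
    """Count surviving paralog pairs, indexed by the doubling event that created them.

    Assumption: after each doubling, one copy always survives and the second
    copy survives with probability `keep_both` (a stand-in for fractionation).
    """
    counts = Counter()
    for _ in range(n_genes):
        lineages = [()]  # each lineage is a tuple of branch labels, one per event
        for _event in range(n_events):
            doubled = []
            for lin in lineages:
                doubled.append(lin + (0,))
                if rng.random() < keep_both:
                    doubled.append(lin + (1,))
            lineages = doubled
        for a, b in combinations(lineages, 2):
            # The first position where two lineages differ is the event
            # at which the pair was created.
            split = next(i for i, (x, y) in enumerate(zip(a, b)) if x != y)
            counts[split] += 1
    return counts


def component_params(elapsed_time, rate, n_sites):
    """Mean and variance of similarity for pairs created `elapsed_time` ago.

    Assumption: sites diverge independently, so similarity is a binomial
    proportion with success probability p = exp(-rate * elapsed_time).
    """
    p = math.exp(-rate * elapsed_time)
    return p, p * (1.0 - p) / n_sites


def mixture(counts, event_ages, rate, n_sites):
    """Return (amplitude, mean, variance) for each event's component."""
    total = sum(counts.values())
    components = []
    for event, age in enumerate(event_ages):
        mean, var = component_params(age, rate, n_sites)
        weight = counts[event] / total if total else 0.0
        components.append((weight, mean, var))
    return components


if __name__ == "__main__":
    rng = random.Random(0)
    counts = simulate_pair_counts(n_events=2, keep_both=0.4, n_genes=1000, rng=rng)
    # Event ages are listed oldest first, in arbitrary time units.
    for weight, mean, var in mixture(counts, [60.0, 20.0], rate=0.01, n_sites=300):
        print(f"amplitude={weight:.3f} mean={mean:.3f} variance={var:.6f}")
```

In this sketch the amplitude of each component comes from the branching process (how many pairs from that event survive), while its mode and variance come from the mutational model, which is the separation of roles the Methods describe.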
Zhang, Y., Zheng, C., & Sankoff, D. (2019). A branching process for homology distribution-based inference of polyploidy, speciation and loss. Algorithms for Molecular Biology, 14(1). https://doi.org/10.1186/s13015-019-0153-8