New genome similarity measures based on conserved gene adjacencies

Luis Antonio B. Kowada; Daniel Doerr; Simone Dantas; Jens Stoye

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2016) 9649 204-224

DOI: 10.1007/978-3-319-31957-5_15

Abstract

Many important questions in molecular biology, evolution and biomedicine can be addressed by comparative genomics approaches. One of the basic tasks when comparing genomes is the definition of measures of similarity (or dissimilarity) between two genomes, for example to elucidate the phylogenetic relationships between species. The power of different genome comparison methods varies with the underlying formal model of a genome. The simplest models impose the strong restriction that each genome under study must contain the same genes, each in exactly one copy. More realistic models allow several copies of a gene in a genome. One speaks of gene families, and comparative genomics methods that allow this kind of input are called gene family-based. The most powerful, but also most complex, models avoid this preprocessing of the input data and instead integrate the family assignment within the comparative analysis. Such methods are called gene family-free. In this paper, we study an intermediate approach between family-based and family-free genomic similarity measures. The model, called gene connections, is on the one hand more flexible than the family-based model, on the other hand the resulting data structure is less complex than in the family-free approach. This intermediate status allows us to achieve results comparable to those for family-free methods, but at running times similar to those for the family-based approach. Within the gene connection model, we define three variants of genomic similarity measures that have different expression power. We give polynomial-time algorithms for two of them, while we show NP-hardness of the third, most powerful one. We also generalize the measures and algorithms to make them more robust against recent local disruptions in gene order. Our theoretical findings are supported by experimental results, proving the applicability and performance of our newly defined similarity measures.
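
To make the notion of a conserved adjacency concrete, the following minimal sketch counts adjacencies shared by two genomes under the simplest model mentioned in the abstract, where both genomes contain the same genes, each in exactly one copy. It is an illustration only and does not implement the paper's gene connection measures. The function names and the representation of a genome as a linear, unsigned list of gene labels (ignoring gene orientation) are assumptions made for this example.

```python
def adjacencies(genome):
    """Return the set of unordered neighbor pairs in a linear gene order.

    Assumption: a genome is a list of gene labels, orientation ignored.
    """
    return {frozenset(pair) for pair in zip(genome, genome[1:])}


def conserved_adjacencies(genome_a, genome_b):
    """Count gene pairs that are neighbors in both genomes.

    Only valid for the simplest model: same gene content, one copy per gene.
    """
    same_content = sorted(genome_a) == sorted(genome_b)
    single_copy = len(set(genome_a)) == len(genome_a)
    if not (same_content and single_copy):
        raise ValueError("both genomes must contain the same genes, each exactly once")
    return len(adjacencies(genome_a) & adjacencies(genome_b))


# Hypothetical toy genomes: genes b and c are swapped in genome_b.
genome_a = ["a", "b", "c", "d", "e"]
genome_b = ["a", "c", "b", "d", "e"]
shared = conserved_adjacencies(genome_a, genome_b)
print(shared)
```

In this toy input, the pairs (b, c) and (d, e) are neighbors in both orders, while the swap breaks the other adjacencies. Models with gene families or family-free similarity scores cannot use this simple set intersection directly, which is the gap the measures in the paper address.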

Cite

Kowada, L. A. B., Doerr, D., Dantas, S., & Stoye, J. (2016). New genome similarity measures based on conserved gene adjacencies. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9649, pp. 204–224). Springer Verlag. https://doi.org/10.1007/978-3-319-31957-5_15
