Genomes containing duplicates are hard to compare

Abstract

In this paper, we study the algorithmic complexity of computing (dis)similarity measures between two genomes that contain duplicated genes. In that case, there are usually two main ways to compute a given (dis)similarity measure M between two genomes G1 and G2. The first, which we call the matching model, computes a one-to-one correspondence between genes of G1 and genes of G2 such that M is optimized in the resulting permutation. The second, called the exemplar model, keeps exactly one copy of each gene in G1 (resp. G2) and deletes all the other copies, again such that M is optimized in the resulting permutation. We present results on the algorithmic complexity of computing three similarity measures (number of common intervals, MAD number and SAD number) in these two models, showing that the problem becomes NP-complete for each of them as soon as genomes contain duplicates. For MAD and SAD, we further prove that the problems are APX-hard under both models. © Springer-Verlag Berlin Heidelberg 2006.
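
To make the exemplar model concrete, here is a minimal brute-force sketch in Python. It is not the authors' method and only covers the exemplar model with the common-intervals measure. It makes several assumptions of its own: genes are unsigned, both genomes contain the same gene families, and a common interval is any set of genes (including single genes and the whole genome) that is contiguous in both permutations. The function names `exemplarizations`, `common_intervals` and `best_exemplar` are hypothetical.

```python
from itertools import product


def exemplarizations(genome):
    """Yield every way to keep exactly one copy of each gene family.

    `genome` is a sequence of gene-family labels; duplicates are repeated labels.
    """
    positions = {}
    for i, gene in enumerate(genome):
        positions.setdefault(gene, []).append(i)
    # Pick one position per family, then restore the original left-to-right order.
    for choice in product(*positions.values()):
        yield tuple(genome[i] for i in sorted(choice))


def common_intervals(p1, p2):
    """Count gene sets that are contiguous in both permutations (O(n^2) brute force).

    Assumption: singletons and the full gene set are counted.
    """
    pos = {gene: i for i, gene in enumerate(p2)}
    count = 0
    for i in range(len(p1)):
        lo = hi = pos[p1[i]]
        for j in range(i, len(p1)):
            lo = min(lo, pos[p1[j]])
            hi = max(hi, pos[p1[j]])
            # p1[i..j] is contiguous in p2 iff its positions span exactly j - i + 1 slots.
            if hi - lo == j - i:
                count += 1
    return count


def best_exemplar(g1, g2):
    """Maximize the number of common intervals over all pairs of exemplarizations."""
    return max(
        common_intervals(a, b)
        for a in exemplarizations(g1)
        for b in exemplarizations(g2)
    )
```

For example, `best_exemplar("abca", "bca")` returns 6: keeping the second copy of `a` in the first genome yields `bca`, which matches the second genome exactly. The enumeration grows exponentially with the number of duplicated families, so this sketch is only usable on toy inputs.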

Chauve, C., Fertin, G., Rizzi, R., & Vialette, S. (2006). Genomes containing duplicates are hard to compare. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3992 LNCS-II, pp. 783–790). Springer Verlag. https://doi.org/10.1007/11758525_105
