Domain architecture in homolog identification

N. Song; R. D. Sedgewick; D. Durand

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2006) 4205 LNBI 11-23

DOI: 10.1007/11864127_2

Abstract

Homology identification is the first step in many genomic studies. Current methods, based on sequence comparison, can produce a substantial number of misassignments because homologous domains align in otherwise unrelated sequences. Here we propose methods to detect homologs through explicit comparison of domain architecture. We developed several schemes for scoring the similarity of a pair of protein sequences by exploiting an analogy between comparing proteins by their domain content and comparing documents by their word content. We evaluate the proposed methods on a benchmark of fifteen sequence families of known evolutionary history. The results demonstrate the effectiveness of comparing domain architectures using these similarity measures. We also demonstrate the importance both of weighting critical domains and of compensating for proteins with large numbers of domains. © Springer-Verlag Berlin Heidelberg 2006.
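The document analogy in the abstract can be illustrated with a small sketch. The abstract does not specify the scoring schemes, so everything here is an assumption chosen for illustration: each protein is treated as a bag of domains, domains are weighted by an inverse-document-frequency term (one possible way to emphasize rarer, more informative domains), and cosine normalization is one possible way to compensate for proteins with many domains. The function names and the toy protein and domain labels are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def idf_weights(proteins):
    """Assumed weighting: log(N / number of proteins containing the domain)."""
    n = len(proteins)
    doc_freq = Counter()
    for domains in proteins.values():
        doc_freq.update(set(domains))  # count each domain once per protein
    return {d: math.log(n / df) for d, df in doc_freq.items()}


def weighted_vector(domains, idf):
    """Map a protein's domain list to {domain: copy count * idf weight}."""
    counts = Counter(domains)
    return {d: c * idf.get(d, 0.0) for d, c in counts.items()}


def cosine_similarity(u, v):
    """Cosine of two sparse vectors; returns 0.0 if either has zero length."""
    dot = sum(w * v.get(d, 0.0) for d, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def architecture_similarity(a, b, idf):
    """Score two domain architectures (lists of domain labels)."""
    return cosine_similarity(weighted_vector(a, idf), weighted_vector(b, idf))


# Hypothetical toy data: protein ids mapped to their domain labels.
toy_proteins = {
    "P1": ["SH3", "SH2", "Kinase"],
    "P2": ["SH2", "Kinase"],
    "P3": ["Kinase", "PH"],
    "P4": ["Kinase", "C2", "C2"],
}

idf = idf_weights(toy_proteins)
sim_p1_p2 = architecture_similarity(toy_proteins["P1"], toy_proteins["P2"], idf)
sim_p1_p3 = architecture_similarity(toy_proteins["P1"], toy_proteins["P3"], idf)
```

In this toy set the `Kinase` label occurs in every protein, so its assumed weight is zero, and a pair that shares only that domain (`P1` and `P3`) scores lower than a pair that also shares a rarer domain (`P1` and `P2`). This mirrors the abstract's concern about a single shared domain in otherwise unrelated sequences, under the assumptions stated above.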

Cite

Song, N., Sedgewick, R. D., & Durand, D. (2006). Domain architecture in homolog identification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4205 LNBI, pp. 11–23). Springer Verlag. https://doi.org/10.1007/11864127_2