Analysis of superfamily specific profile-profile recognition accuracy

7Citations
Citations of this article
11Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Background: Annotation of sequences that share little similarity to sequences of known function remains a major obstacle in genome annotation. Some of the best methods of detecting remote relationships between protein sequences are based on matching sequence profiles. We analyse the superfamily specific performance of sequence profile-profile matching. Our benchmark consists of a set of 16 protein superfamilies that are highly diverse at the sequence level. We relate the performance to the number of sequences in the profiles, the profile diversity and the extent of structural conservation in the superfamily. Results: The performance varies greatly between superfamilies with the truncated receiver operating characteristic, ROC10, varying from 0.95 down to 0.01. These large differences persist even when the profiles are trimmed to approximately the same level of diversity. Conclusions: Although the number of sequences in the profile (profile width) and degree of sequence variation within positions in the profile (profile diversity) contribute to accurate detection there are other superfamily specific factors. © 2004 Casbon and Saqi; licensee BioMed Central Ltd.

Cite

CITATION STYLE

APA

Casbon, J. A., & Saqi, M. A. S. (2004). Analysis of superfamily specific profile-profile recognition accuracy. BMC Bioinformatics, 5. https://doi.org/10.1186/1471-2105-5-200

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free