Multi-domain protein family classification using isomorphic inter-property relationships

Abstract

Multi-domain proteins result from the duplication and combination of a complex but limited number of domains. The ability to distinguish multi-domain homologs from unrelated pairs that share a domain is essential to genomic analysis. Heuristics based on sequence similarity and alignment coverage have been proposed to screen out domain insertions but have met with limited success. In this paper we propose a unique protein classification schema for multi-domain protein superfamilies. Segmented profiles of physico-chemical properties and amino acid composition are created and then reduced in dimensionality by vector quantization, yielding a feature profile for rule discovery and classification. Association rules are mined to identify isomorphic relationships that govern the formation of domains between proteins, so that homologous pairs are correctly predicted and unrelated pairs, including those that share domains, are rejected. Our results demonstrate that conserved domain classes can be classified effectively using these feature profiles, and that the classifier is not susceptible to the class imbalances frequently encountered in these databases. © 2009 Springer Berlin Heidelberg.
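
The pipeline outlined in the abstract (segmented property profiles, quantization into discrete symbols, then rule mining) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the properties, segment counts, or codebook, so the Kyte-Doolittle hydropathy scale, the scalar nearest-centroid codebook, and the function names `segment_profile`, `quantize`, and `rule_confidence` are all assumptions chosen for illustration.

```python
from statistics import mean

# Kyte-Doolittle hydropathy scale. Assumption: used here as one example
# physico-chemical property; the abstract does not say which were used.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def segment_profile(seq, n_segments):
    """Return the mean hydropathy of each of n_segments contiguous slices."""
    if not 0 < n_segments <= len(seq):
        raise ValueError("n_segments must be between 1 and len(seq)")
    # Integer boundaries that split the sequence into near-equal slices.
    bounds = [i * len(seq) // n_segments for i in range(n_segments + 1)]
    return [mean(KD[aa] for aa in seq[a:b]) for a, b in zip(bounds, bounds[1:])]


def quantize(profile, codebook):
    """Map each profile value to the index of its nearest codebook entry.

    Assumption: a fixed scalar codebook stands in for a learned one.
    """
    return [
        min(range(len(codebook)), key=lambda j: abs(value - codebook[j]))
        for value in profile
    ]


def rule_confidence(rows, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent.

    rows are quantized profiles; antecedent and consequent are
    (segment_index, symbol) pairs.
    """
    (i, a), (j, c) = antecedent, consequent
    matching = [row for row in rows if row[i] == a]
    if not matching:
        return 0.0
    return sum(row[j] == c for row in matching) / len(matching)
```

For instance, `quantize(segment_profile(seq, 4), [-3.0, 0.0, 3.0])` turns a sequence into four symbols drawn from a three-entry codebook, and `rule_confidence` then measures how often one segment's symbol co-occurs with another's across a set of such profiles. A full treatment would use several properties per segment and a codebook learned from data.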

Cite

Singh, H., Chowriappa, P., & Dua, S. (2009). Multi-domain protein family classification using isomorphic inter-property relationships. In Communications in Computer and Information Science (Vol. 40, pp. 473–484). https://doi.org/10.1007/978-3-642-03547-0_45
