When less is more: Improving classification of protein families with a minimal set of global features

Abstract

Sequence-derived structural and physicochemical features have been used to develop models for predicting protein families. Here, we test the hypothesis that high-level functional groups of proteins can be classified using a very small set of global features extracted directly from sequence alone. We represent each protein by a small number of normalized global sequence features and classify the proteins into functional groups using support vector machines (SVMs). We also investigate in detail how specific subsets of features contribute to classification quality. Representing proteins by global features provides effective information for protein family classification, with results comparable to those obtained using local sequence alignment scores. Moreover, combining global and local sequence features significantly improves classification performance. © Springer-Verlag Berlin Heidelberg 2007.
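
The representation step described in the abstract can be illustrated with a minimal sketch. The abstract does not list the features the authors used, so everything here is an assumption chosen for illustration: the function names `global_features` and `min_max_normalize`, the feature choice (sequence length, amino acid composition, and two coarse residue-group fractions), the residue groupings, and the min-max scaling. The resulting vectors could then be passed to any SVM implementation.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Residue groupings assumed for illustration; not taken from the paper.
HYDROPHOBIC = set("AVILMFWC")
CHARGED = set("DEKR")


def global_features(sequence):
    """Return a fixed-length vector of global features for one protein.

    Layout: [length, 20 composition fractions, hydrophobic fraction,
    charged fraction], i.e. 23 values regardless of sequence length.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    n = len(seq)
    composition = [counts[aa] / n for aa in AMINO_ACIDS]
    hydrophobic_frac = sum(counts[aa] for aa in HYDROPHOBIC) / n
    charged_frac = sum(counts[aa] for aa in CHARGED) / n
    return [float(n)] + composition + [hydrophobic_frac, charged_frac]


def min_max_normalize(vectors):
    """Scale each feature column to [0, 1] across the whole dataset.

    Constant columns are mapped to 0.0 to avoid division by zero.
    """
    columns = list(zip(*vectors))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in vectors
    ]
```

Because every protein maps to a vector of the same length, no alignment between sequences is needed before classification, which is what distinguishes this kind of global representation from one based on local alignment scores.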

Cite

Varshavsky, R., Fromer, M., Man, A., & Linial, M. (2007). When less is more: Improving classification of protein families with a minimal set of global features. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4645 LNBI, pp. 12–24). Springer Verlag. https://doi.org/10.1007/978-3-540-74126-8_3
