Motivation: A new representation for protein secondary structure prediction based on frequent amino acid patterns is described and evaluated. We discuss in detail how to identify frequent patterns in a protein sequence database using a level-wise search technique, how to define a set of features from those patterns and how to use those features in the prediction of the secondary structure of a protein sequence using support vector machines (SVMs).

Results: Three different sets of features based on frequent patterns are evaluated in a blind testing setup using 150 targets from the EVA contest and compared to predictions of PSI-PRED, PHD and PROFsec. Despite being trained on only 940 proteins, a simple SVM classifier based on this new representation yields results comparable to PSI-PRED and PROFsec. Finally, we show that the method contributes significant information to consensus predictions.

© 2006 Oxford University Press.
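To give a rough sense of what a level-wise search for frequent patterns can look like, here is a minimal sketch. It is not the authors' implementation. It assumes patterns are plain contiguous substrings and that support is the number of sequences containing a pattern. The function name `frequent_patterns` and the parameters `min_support` and `max_len` are hypothetical, and the pattern language used in the paper may well be richer.

```python
from collections import Counter


def frequent_patterns(sequences, min_support, max_len=5):
    """Level-wise (Apriori-style) search for frequent contiguous patterns.

    Assumption: a pattern is a contiguous substring, and its support is
    the number of sequences that contain it at least once.
    """
    frequent = {}
    prev = None  # frequent patterns of the previous level (length - 1)
    for length in range(1, max_len + 1):
        counts = Counter()
        for seq in sequences:
            seen = set()  # count each pattern once per sequence
            for i in range(len(seq) - length + 1):
                cand = seq[i:i + length]
                # Pruning: a pattern can only be frequent if both of its
                # (length - 1)-substrings were frequent at the previous level.
                if prev is not None and (
                    cand[:-1] not in prev or cand[1:] not in prev
                ):
                    continue
                seen.add(cand)
            counts.update(seen)
        level = {p: c for p, c in counts.items() if c >= min_support}
        if not level:
            break  # no frequent patterns at this length, so none at longer ones
        frequent.update(level)
        prev = level
    return frequent


if __name__ == "__main__":
    toy_db = ["ACDA", "ACDE", "GACD"]  # toy sequences, not real data
    print(frequent_patterns(toy_db, min_support=2))
```

Each level only considers candidates whose shorter sub-patterns were already frequent, which is what keeps the search tractable on a large sequence database. The resulting patterns could then be turned into features for a classifier such as an SVM. How the paper defines those features is not specified in the abstract.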
Birzele, F., & Kramer, S. (2006). A new representation for protein secondary structure prediction based on frequent patterns. Bioinformatics, 22(21), 2628–2634. https://doi.org/10.1093/bioinformatics/btl453