Protein tertiary structure is indispensable in revealing the biological functions of proteins. De novo prediction of protein tertiary structure depends on protein fold recognition. This study proposes a novel method for predicting protein fold types that takes the primary sequence as input. The proposed method, PFP-RFSM, employs a random forest classifier and a comprehensive feature representation that includes both sequence and predicted structure descriptors. In particular, we propose a method for generating features based on sequence motifs; these features are employed in protein fold prediction for the first time. PFP-RFSM and ten representative protein fold predictors are validated on a benchmark dataset consisting of 27 fold types. Experiments demonstrate that PFP-RFSM outperforms all existing protein fold predictors and improves the success rates by 2% to 14%. The results suggest that sequence motifs are effective in the classification and analysis of protein sequences.
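To make the idea of motif-based features concrete, the following is a minimal sketch of how a primary sequence could be turned into a fixed-length vector of motif counts. It is not the PFP-RFSM feature generation procedure, which the abstract does not describe. The motif patterns in `MOTIFS` and the function name `motif_features` are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical motif patterns, written as regular expressions over
# one-letter amino acid codes. They are illustrative assumptions and
# are not taken from the paper.
MOTIFS = ["G.G", "C..C", "[ST]P"]


def motif_features(sequence, motifs=MOTIFS):
    """Return one count per motif: the number of matches in the sequence.

    Overlapping occurrences are counted separately.
    """
    features = []
    for motif in motifs:
        # Wrapping the motif in a lookahead makes each match zero-width,
        # so the scan advances one residue at a time and overlaps count.
        pattern = re.compile("(?=" + motif + ")")
        features.append(sum(1 for _ in pattern.finditer(sequence)))
    return features


if __name__ == "__main__":
    print(motif_features("GAGAG"))
    print(motif_features("CAACSP"))
```

Vectors of this kind, one per protein, could then be concatenated with other descriptors and passed to a classifier such as a random forest. That step is omitted here because the abstract gives no details on the classifier configuration.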
CITATION STYLE
Li, J., Wu, J., & Chen, K. (2013). PFP-RFSM: Protein fold prediction by using random forests and sequence motifs. Journal of Biomedical Science and Engineering, 06(12), 1161–1170. https://doi.org/10.4236/jbise.2013.612145