Structure-based prediction of protein-peptide binding regions using Random Forest

Ghazaleh Taherzadeh; Yaoqi Zhou; Alan Wee Chung Liew; Yuedong Yang

Journal Article, Open Access

Bioinformatics (2018) 34(3) 477-484

DOI: 10.1093/bioinformatics/btx614

68 Citations

99 Readers

Abstract

Motivation: Protein-peptide interactions are among the most important biological interactions and play a crucial role in many diseases, including cancer. Knowledge of these interactions therefore provides invaluable insights into cellular processes, functional mechanisms, and drug discovery. Protein-peptide interactions can be analyzed by studying the structures of protein-peptide complexes. However, only a small portion of these interactions have known complex structures, and experimental determination of protein-peptide interactions is costly and inefficient. Predicting peptide-binding sites computationally is thus useful for improving the efficiency and cost effectiveness of experimental studies. Here, we established a machine learning method called SPRINT-Str (Structure-based prediction of protein-Peptide Residue-level Interaction) that uses structural information to predict protein-peptide binding residues. These predicted binding residues are then used to infer the peptide-binding site by a clustering algorithm.

Results: SPRINT-Str achieves robust and consistent results for prediction of protein-peptide binding regions in terms of both residues and sites. The Matthews correlation coefficient (MCC) is 0.27 for 10-fold cross validation and 0.293 for the independent test set; the corresponding areas under the curve (AUC) are 0.775 and 0.782. The prediction outperforms other state-of-the-art methods, including our previously developed sequence-based method. A further spatial neighbor clustering of predicted binding residues leads to prediction of binding sites at 20-116% higher coverage than the next best method at all precision levels in the test set. Applying SPRINT-Str to proteins that bind DNA, RNA and carbohydrates confirms the method's capability of separating peptide-binding sites from other functional sites. More importantly, similar performance in prediction of binding residues and sites is obtained when experimentally determined structures are replaced by unbound structures or quality model structures built from homologs, indicating its wide applicability.
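The two quantitative ideas in the abstract, residue-level MCC and spatial clustering of predicted binding residues, can be illustrated with a minimal Python sketch. This is not the SPRINT-Str implementation: the abstract does not specify the clustering algorithm, so the single-linkage grouping below, the function names mcc and cluster_residues, the 4.0 angstrom cutoff, and the toy coordinates are all assumptions made for illustration only.

```python
import math
from collections import deque


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom


def cluster_residues(coords, cutoff):
    """Group predicted binding residues into spatially connected clusters.

    coords maps a residue index to one representative (x, y, z) point.
    Two residues are linked when their points lie within `cutoff`;
    clusters are the connected components of that graph (an assumed,
    simplified stand-in for the paper's clustering step).
    """
    unvisited = set(coords)
    clusters = []
    while unvisited:
        seed = min(unvisited)  # deterministic starting residue
        unvisited.remove(seed)
        queue = deque([seed])
        members = [seed]
        while queue:
            current = queue.popleft()
            near = [r for r in unvisited
                    if math.dist(coords[current], coords[r]) <= cutoff]
            for r in near:
                unvisited.remove(r)
                queue.append(r)
                members.append(r)
        clusters.append(sorted(members))
    # Largest cluster first: a natural candidate for the binding site.
    clusters.sort(key=len, reverse=True)
    return clusters


# Hypothetical predicted binding residues (toy coordinates, in angstroms).
predicted = {
    10: (0.0, 0.0, 0.0),
    11: (3.0, 0.0, 0.0),
    12: (6.0, 0.0, 0.0),
    40: (30.0, 0.0, 0.0),
    41: (33.0, 0.0, 0.0),
}
sites = cluster_residues(predicted, cutoff=4.0)
```

In this toy input, residues 10, 11 and 12 form a chain of neighbors and end up in one cluster, while residues 40 and 41 form a second, smaller one. Taking the largest cluster as the predicted site is one plausible design choice; the published method may rank or merge clusters differently.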

Cite

Taherzadeh, G., Zhou, Y., Liew, A. W. C., & Yang, Y. (2018). Structure-based prediction of protein-peptide binding regions using Random Forest. Bioinformatics, 34(3), 477–484. https://doi.org/10.1093/bioinformatics/btx614
