PDP-RF: Protein domain boundary prediction using random forest classifier

Piyali Chatterjee; Subhadip Basu; Julian Zubek; Mahantapas Kundu; Mita Nasipuri; Dariusz Plewczynski

Conference Proceedings (Open Access)

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2015) 9124 441-450

DOI: 10.1007/978-3-319-19941-2_42

Abstract

Domain boundary prediction is a crucial task for the functional classification of proteins, homology-based protein structure prediction, and high-throughput structural genomics. Each amino acid is represented by a set of physico-chemical properties. A Random Forest classifier is explored for accurate prediction of domain regions, trained on a curated dataset obtained from the CATH database. The software is tested on proteins from the CASP-6, CASP-8, CASP-9 and CASP-10 targets to evaluate its prediction accuracy using three-fold cross-validation experiments. Finally, a consensus approach is used to combine the results of the classifiers obtained through the cross-validation experiments. The average recall and precision scores achieved by the resulting consensus-based Random Forest classifiers (PDP-RF) are 0.98 and 0.88, respectively, on the CASP targets. The overall accuracy and F-score of PDP-RF are 0.87 and 0.91, respectively.
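The consensus step and the reported metrics can be illustrated with a minimal sketch. The abstract does not state the exact consensus rule, so a simple per-residue majority vote over three fold-trained classifiers is assumed here. The function names, the label convention (1 = domain residue, 0 = boundary residue), and the toy prediction vectors are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def majority_consensus(fold_predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-residue 0/1 labels from several classifiers by majority vote.

    Assumption: the paper's consensus rule is not given in the abstract;
    a plain majority vote is used here purely for illustration.
    """
    n_models = len(fold_predictions)
    length = len(fold_predictions[0])
    if any(len(p) != length for p in fold_predictions):
        raise ValueError("all prediction vectors must have the same length")
    # A residue is labelled 1 when more than half of the models vote 1.
    return [
        1 if 2 * sum(p[i] for p in fold_predictions) > n_models else 0
        for i in range(length)
    ]


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute precision, recall, F-score and accuracy for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_score = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "accuracy": accuracy,
    }


# Toy data (hypothetical): true labels for eight residues and the outputs
# of three classifiers, one per cross-validation fold.
y_true = [1, 1, 1, 0, 0, 1, 1, 1]
fold_a = [1, 1, 1, 1, 0, 1, 1, 1]
fold_b = [1, 1, 0, 0, 0, 1, 1, 1]
fold_c = [1, 1, 1, 1, 0, 1, 0, 1]

consensus = majority_consensus([fold_a, fold_b, fold_c])
metrics = binary_metrics(y_true, consensus)
print(consensus)
print(metrics)
```

With an odd number of voters a majority vote never ties, which is one reason it pairs naturally with three-fold cross-validation. Whether PDP-RF uses this rule or a different combination scheme is not specified in the abstract.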

Cite

Chatterjee, P., Basu, S., Zubek, J., Kundu, M., Nasipuri, M., & Plewczynski, D. (2015). PDP-RF: Protein domain boundary prediction using random forest classifier. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9124, pp. 441–450). Springer Verlag. https://doi.org/10.1007/978-3-319-19941-2_42
