PDP-RF: Protein domain boundary prediction using random forest classifier

2Citations
Citations of this article
9Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

The Domain Boundary Prediction is a crucial task for functional classification of proteins, homology-based protein structure prediction and for high-throughput structural genomics. Each amino acid is represented using a set of physico-chemical properties. Random Forest Classifier is explored for accurate prediction of domain regions by training on the curated dataset obtained from CATH database. The software is tested on proteins of CASP-6, CASP-8, CASP-9 and CASP-10 targets in order to evaluate its prediction accuracy using three fold cross validation experiments. Finally, a consensus approach is used to combine results of the classifiers obtained through the cross-validation experiments. The average recall and precision scores achieved by the developed consensus based Random Forest classifiers (PDP-RF) are 0.98 and 0.88 respectively for prediction of CASP targets. The overall accuracy and F-scores of the PDP-RF are observed as 0.87 and 0.91 respectively.

Cite

CITATION STYLE

APA

Chatterjee, P., Basu, S., Zubek, J., Kundu, M., Nasipuri, M., & Plewczynski, D. (2015). PDP-RF: Protein domain boundary prediction using random forest classifier. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9124, pp. 441–450). Springer Verlag. https://doi.org/10.1007/978-3-319-19941-2_42

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free