Machine learning tools are employed to establish a relationship between the characteristics of protein-ligand binding sites and enzyme class. Enzyme classification is a challenging problem from a data mining perspective because of (i) class imbalance and (ii) the difficulty of selecting appropriate features. We address the problem by choosing novel features from the protein binding site. The Protein Ligand Interaction Database (PLID), which gives a comprehensive view of the binding sites in a protein along with other contact information, is updated and presented here as PLID v1.1. The database facilitates the study of protein-ligand interactions. Novel features arising from protein-ligand interaction, including chemical compound features as well as fraction of contact and tightness, are investigated for the classification task. The weighted classification accuracy for the data set with binding site residues as features is found to be 56% using a Random Forest classifier. It may be concluded that either the binding site features do not adequately represent the enzyme class information or the result is caused by the class imbalance. This problem needs further investigation. © 2011 Springer-Verlag.
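The abstract does not define "weighted classification accuracy", so the following is only a minimal sketch of why class imbalance can distort an accuracy figure. It assumes one common reading, the mean of per-class recalls (often called balanced accuracy), which may differ from the authors' measure. The function names and the toy labels `EC1` and `EC2` are hypothetical and are not taken from the paper or from PLID.

```python
from collections import Counter

def plain_accuracy(y_true, y_pred):
    # Fraction of all samples predicted correctly, regardless of class.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def per_class_recall(y_true, y_pred):
    # For each true class, the fraction of its samples predicted correctly.
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / support[c] for c in support}

def balanced_accuracy(y_true, y_pred):
    # Assumed interpretation: unweighted mean of the per-class recalls,
    # so every class counts equally however many samples it has.
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)

# Hypothetical imbalanced toy data: 8 samples of one class, 2 of another.
y_true = ["EC1"] * 8 + ["EC2"] * 2
# A classifier that always predicts the majority class.
y_pred = ["EC1"] * 10

print(plain_accuracy(y_true, y_pred))
print(balanced_accuracy(y_true, y_pred))
```

On this toy data the majority-class predictor scores well on plain accuracy while never identifying the minority class, which the per-class average exposes. This is the kind of effect that makes it hard to tell whether a modest score reflects weak features or class imbalance.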
CITATION STYLE
Reddy, B. R., Rani, T. S., Bhavani, S. D., Bapi, R. S., & Sastry, G. N. (2011). Correlating binding site residues of the protein and ligand features to its functionality. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7077 LNCS, pp. 166–173). https://doi.org/10.1007/978-3-642-27242-4_20