A topological approach for protein classification

Abstract

Protein function and dynamics are closely related to protein sequence and structure. However, predicting protein function and dynamics from sequence and structure remains a fundamental challenge in molecular biology. Protein classification, which is typically done by measuring the similarity between proteins based on sequence or physical information, is a crucial step toward understanding protein function and dynamics. Persistent homology is a new branch of algebraic topology that has found success in topological data analysis across a variety of disciplines, including molecular biology. The present work explores the potential of using persistent homology as an independent tool for protein classification. To this end, we propose a molecular topological fingerprint based support vector machine (MTF-SVM) classifier. Specifically, we construct machine learning feature vectors solely from protein topological fingerprints, which are topological invariants generated during the filtration process. To validate the present MTF-SVM approach, we consider four types of problems. First, we study protein-drug binding by using the M2 channel protein of influenza A virus. We achieve 96% accuracy in discriminating drug-bound and unbound M2 channels. Secondly, we examine the use of MTF-SVM for the classification of hemoglobin molecules in their relaxed and taut forms and obtain about 80% accuracy. Thirdly, the identification of all-alpha, all-beta, and alpha-beta protein domains is carried out using 900 proteins. We have found an 85% success rate in this identification. Finally, we apply the present technique to 55 classification tasks of protein superfamilies over 1357 samples and 246 tasks over 11944 samples. Average accuracies of 82% and 73% are attained, respectively. The present study establishes computational topology as an independent and effective alternative for protein classification.
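
The abstract does not spell out how the feature vectors are built, so the following is only a minimal sketch of the general idea of turning a persistence barcode into a fixed-length vector. It is restricted to dimension 0 (connected components) of a Vietoris-Rips filtration over atom coordinates, where each finite bar dies at the length of a minimum-spanning-tree edge. The function names `h0_barcode` and `fingerprint_features`, the choice of four summary statistics, and the use of the full inter-point distance as the filtration value are all assumptions made for illustration, not the authors' construction.

```python
import itertools
import math


def h0_barcode(points):
    """Death values of the finite H0 bars of a Vietoris-Rips filtration.

    Every point is born at filtration value 0. A component dies when an
    edge first merges it into another one, which is found here with a
    union-find pass over edges sorted by length (Kruskal's algorithm).
    Assumption: the filtration value is the full inter-point distance.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Find the component root, shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(n), 2)
    )
    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj        # merge the two components
            deaths.append(length)  # one H0 bar ends at this length
    return deaths


def fingerprint_features(deaths):
    """Illustrative fixed-length vector: [count, longest, mean, total]."""
    if not deaths:
        return [0.0, 0.0, 0.0, 0.0]
    total = sum(deaths)
    return [float(len(deaths)), max(deaths), total / len(deaths), total]


# Hypothetical three-atom example on a line.
atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
features = fingerprint_features(h0_barcode(atoms))
```

Vectors of this kind, one per protein, could then be passed with class labels to any support vector machine implementation. A fuller fingerprint would also include higher-dimensional bars (loops and cavities), which need a dedicated persistent homology library and are not covered by this sketch.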

Cite

Cang, Z., Mu, L., Wu, K., Opron, K., Xia, K., & Wei, G. W. (2015). A topological approach for protein classification. Computational and Mathematical Biophysics, 3(1), 140–162. https://doi.org/10.1515/mlbmb-2015-0009
