Semi-supervised prediction of protein interaction sites from unlabeled sample information


Abstract

Background: Recognizing protein interaction sites is important for understanding many biological processes and signaling pathways and for drug design. However, most sites on protein sequences cannot be labeled as interface or non-interface sites because only a small fraction of protein interactions has been identified, which limits the accuracy and generalization ability of interaction-site predictors. It is therefore necessary to improve prediction performance by using large amounts of unlabeled data together with small amounts of labeled data and background knowledge.

Results: In this work, three semi-supervised support vector machine-based methods are proposed that incorporate information from unlabeled protein sites to improve protein interaction site prediction. Five features related to the evolutionary conservation of amino acids are extracted from the HSSP database and the ConSurf Server to represent the residues within a protein sequence: residue spatial sequence spectrum, residue sequence information entropy, relative entropy, residue sequence conserved weight and residue evolution rate. Three predictors are then built to identify interface residues on the protein surface using three types of semi-supervised support vector machine algorithms.

Conclusion: The experimental results demonstrate that the semi-supervised approaches effectively improve prediction performance when unlabeled information is incorporated into the predictors. The best of the three achieves an accuracy of 70.7%, a sensitivity of 62.67% and a specificity of 78.72%. Compared with existing studies, the semi-supervised models show improved prediction performance.
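Two of the conservation features named in the abstract, sequence information entropy and relative entropy, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the exact definitions, so the log base (2), the uniform background distribution, the skipping of gaps, and the function names `column_frequencies`, `shannon_entropy` and `relative_entropy` are all assumptions made here for illustration.

```python
import math
from collections import Counter

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_frequencies(column):
    """Relative frequency of each standard amino acid in one alignment column.

    Gaps and non-standard symbols are skipped (an assumption of this sketch).
    """
    residues = [aa for aa in column.upper() if aa in AMINO_ACIDS]
    if not residues:
        raise ValueError("column has no standard amino acids")
    counts = Counter(residues)
    return {aa: counts[aa] / len(residues) for aa in AMINO_ACIDS}


def shannon_entropy(freqs):
    """Shannon entropy in bits; 0 for a fully conserved column."""
    return -sum(p * math.log2(p) for p in freqs.values() if p > 0)


def relative_entropy(freqs, background):
    """Kullback-Leibler divergence (bits) of the column from a background."""
    return sum(
        p * math.log2(p / background[aa]) for aa, p in freqs.items() if p > 0
    )


# Hypothetical background: every amino acid equally likely.
uniform = {aa: 1 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}

# Two toy alignment columns: one fully conserved, one highly variable.
conserved = column_frequencies("LLLLLLLL")
variable = column_frequencies("LIVM-AGK")

print(shannon_entropy(conserved), relative_entropy(conserved, uniform))
print(shannon_entropy(variable), relative_entropy(variable, uniform))
```

Under these assumptions a conserved column has low entropy and high relative entropy, while a variable column shows the opposite pattern; per-residue values of this kind are what a classifier would receive as part of its feature vector.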

Wang, Y., Mei, C., Zhou, Y., Wang, Y., Zheng, C., Zhen, X., … Wang, B. (2019). Semi-supervised prediction of protein interaction sites from unlabeled sample information. BMC Bioinformatics, 20. https://doi.org/10.1186/s12859-019-3274-7