Improving accuracy of protein-protein interaction prediction by considering the converse problem for sequence representation

9Citations
Citations of this article
41Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Background: With the development of genome-sequencing technologies, protein sequences are readily obtained by translating the measured mRNAs. Therefore predicting protein-protein interactions from the sequences is of great demand. The reason lies in the fact that identifying protein-protein interactions is becoming a bottleneck for eventually understanding the functions of proteins, especially for those organisms barely characterized. Although a few methods have been proposed, the converse problem, if the features used extract sufficient and unbiased information from protein sequences, is almost untouched.Results: In this study, we interrogate this problem theoretically by an optimization scheme. Motivated by the theoretical investigation, we find novel encoding methods for both protein sequences and protein pairs. Our new methods exploit sufficiently the information of protein sequences and reduce artificial bias and computational cost. Thus, it significantly outperforms the available methods regarding sensitivity, specificity, precision, and recall with cross-validation evaluation and reaches ~80% and ~90% accuracy in Escherichia coli and Saccharomyces cerevisiae respectively. Our findings here hold important implication for other sequence-based prediction tasks because representation of biological sequence is always the first step in computational biology.Conclusions: By considering the converse problem, we propose new representation methods for both protein sequences and protein pairs. The results show that our method significantly improves the accuracy of protein-protein interaction predictions. © 2011 Ren et al; licensee BioMed Central Ltd.

Cite

CITATION STYLE

APA

Ren, X., Wang, Y. C., Wang, Y., Zhang, X. S., & Deng, N. Y. (2011). Improving accuracy of protein-protein interaction prediction by considering the converse problem for sequence representation. BMC Bioinformatics, 12. https://doi.org/10.1186/1471-2105-12-409

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free