A Data Driven Model for Predicting RNA-Protein Interactions based on Gradient Boosting Machine

12Citations
Citations of this article
42Readers
Mendeley users who have this article in their library.

Abstract

RNA protein interactions (RPI) play a pivotal role in the regulation of various biological processes. Experimental validation of RPI has been time-consuming, paving the way for computational prediction methods. The major limiting factor of these methods has been the accuracy and confidence of the predictions, and our in-house experiments show that they fail to accurately predict RPI involving short RNA sequences such as TERRA RNA. Here, we present a data-driven model for RPI prediction using a gradient boosting classifier. Amino acids and nucleotides are classified based on the high-resolution structural data of RNA protein complexes. The minimum structural unit consisting of five residues is used as the descriptor. Comparative analysis of existing methods shows the consistently higher performance of our method irrespective of the length of RNA present in the RPI. The method has been successfully applied to map RPI networks involving both long noncoding RNA as well as TERRA RNA. The method is also shown to successfully predict RNA and protein hubs present in RPI networks of four different organisms. The robustness of this method will provide a way for predicting RPI networks of yet unknown interactions for both long noncoding RNA and microRNA.

Cite

CITATION STYLE

APA

Jain, D. S., Gupte, S. R., & Aduri, R. (2018). A Data Driven Model for Predicting RNA-Protein Interactions based on Gradient Boosting Machine. Scientific Reports, 8(1). https://doi.org/10.1038/s41598-018-27814-2

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free