Prediction of RNA-protein interactions using a nucleotide language model


Abstract

Motivation: The accumulation of sequencing data has enabled researchers to predict the interactions between RNA sequences and RNA-binding proteins (RBPs) using novel machine learning techniques. However, existing models are often difficult to interpret and require information beyond the sequences themselves. Bidirectional Encoder Representations from Transformers (BERT) is a language-based deep learning model that is highly interpretable; a model based on the BERT architecture can therefore potentially overcome such limitations. Results: Here, we propose BERT-RBP, a model that predicts RNA-RBP interactions by adapting a BERT architecture pretrained on a human reference genome. Our model outperformed state-of-the-art prediction models on the eCLIP-seq data of 154 RBPs. Detailed analysis further revealed that BERT-RBP can recognize both the transcript region type and the RNA secondary structure based only on sequence information. Overall, the results provide insights into the fine-tuning mechanism of BERT in biological contexts and evidence of the model's applicability to other RNA-related problems.
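
To make the described approach concrete, below is a minimal sketch of fine-tuning a genome-pretrained BERT-style model for binary RNA-RBP binding classification with the Hugging Face transformers library. The checkpoint path, k-mer size, maximum sequence length, and toy training records are assumptions for illustration only, not the authors' released pipeline.

```python
# Hedged sketch: fine-tune a BERT-style nucleotide language model to classify
# RNA fragments as bound / not bound by a given RBP. Checkpoint name, k-mer
# size and example data below are hypothetical placeholders.
import torch
from torch.utils.data import DataLoader, Dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

K = 3  # assumed k-mer size for tokenizing nucleotide sequences

def to_kmers(seq: str, k: int = K) -> str:
    """Convert an RNA sequence into space-separated overlapping k-mers,
    mapping U to T so a DNA-pretrained vocabulary can be reused."""
    seq = seq.upper().replace("U", "T")
    return " ".join(seq[i:i + k] for i in range(len(seq) - k + 1))

class RBPSites(Dataset):
    """(RNA sequence, label) pairs; label 1 marks a binding site (e.g. an eCLIP peak)."""
    def __init__(self, records, tokenizer):
        self.records = records
        self.tokenizer = tokenizer

    def __len__(self):
        return len(self.records)

    def __getitem__(self, idx):
        seq, label = self.records[idx]
        enc = self.tokenizer(to_kmers(seq), truncation=True, max_length=101,
                             padding="max_length", return_tensors="pt")
        item = {k: v.squeeze(0) for k, v in enc.items()}
        item["labels"] = torch.tensor(label, dtype=torch.long)
        return item

# Hypothetical path to a BERT model pretrained on a human reference genome.
checkpoint = "path/to/genome-pretrained-bert"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Toy records standing in for sequences extracted around eCLIP peaks.
train_records = [("ACGUACGUACGUACGUACGU", 1), ("UUGGCCAAUUGGCCAAUUGG", 0)]
loader = DataLoader(RBPSites(train_records, tokenizer), batch_size=32, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for batch in loader:
    optimizer.zero_grad()
    loss = model(**batch).loss  # cross-entropy over bound / not-bound labels
    loss.backward()
    optimizer.step()
```

In this sketch, fine-tuning adds only a small classification head on top of the pretrained encoder; interpretability analyses such as inspecting attention weights would operate on the same fine-tuned model.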

Citation (APA)

Yamada, K., & Hamada, M. (2022). Prediction of RNA-protein interactions using a nucleotide language model. Bioinformatics Advances, 2(1). https://doi.org/10.1093/bioadv/vbac023
