Motivation: MicroRNAs (miRNAs) are endogenous, small non-coding RNAs (about 22 nucleotides long) that play an important role in the post-transcriptional regulation of gene expression via either mRNA cleavage or translation inhibition. Several machine learning-based approaches have been developed to identify novel miRNAs from next-generation sequencing (NGS) data. Most of these methods require precursor or genomic sequences as references. However, the unavailability of genomic sequences often limits miRNA discovery in non-model plants. A systematic approach to identifying novel miRNAs without reference sequences is thus necessary.
Results: In this study, an effective method was developed to identify miRNAs from non-model plants based only on NGS datasets. The miRNA prediction model was trained on several duplex structure-related features of mature miRNAs and their passenger strands using a support vector machine algorithm. Accuracy on the independent test reached 96.61% and 93.04% for dicots (Arabidopsis) and monocots (rice), respectively. Furthermore, real small RNA sequencing data from orchids was tested in this study. Twenty-one predicted orchid miRNAs were selected for experimental validation, and 18 of them were confirmed by qRT-PCR. This novel approach was also compiled as a user-friendly program called microRPM (miRNA Prediction Model).
Availability and implementation: This resource is freely available at http://microRPM.itps.ncku.edu.tw.
Contact: nslin@sinica.edu.tw or sarah321@mail.ncku.edu.tw
Supplementary information: Supplementary data are available at Bioinformatics online.
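The abstract does not list the duplex structure-related features themselves, so the following is only an illustrative sketch of what one such feature set might look like. It assumes the common model of a miRNA/passenger-strand duplex with 2-nt 3' overhangs on each strand; the function name `duplex_features`, the feature names, and the simple position-by-position pairing scheme are hypothetical and are not taken from microRPM.

```python
# Illustrative only: toy duplex features for a candidate mature/passenger pair.
# The feature set and pairing scheme are assumptions, not microRPM's definitions.

WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def duplex_features(mature: str, passenger: str, overhang: int = 2) -> dict:
    """Count pairing types in a gap-free duplex with 3' overhangs.

    Both strands are given 5'->3'. The last `overhang` bases of each strand
    are treated as unpaired, and mature[i] is paired with the passenger
    strand read 3'->5' after trimming its overhang.
    """
    # Accept DNA-style reads by converting T to U.
    mature = mature.upper().replace("T", "U")
    passenger = passenger.upper().replace("T", "U")

    paired_len = min(len(mature), len(passenger)) - overhang
    if paired_len <= 0:
        raise ValueError("sequences are too short for the requested overhang")

    # Trim the passenger 3' overhang, then reverse so index 0 faces mature[0].
    partner = passenger[: len(passenger) - overhang][::-1]

    counts = {"watson_crick": 0, "wobble": 0, "mismatch": 0}
    for m, p in zip(mature[:paired_len], partner[:paired_len]):
        if (m, p) in WATSON_CRICK:
            counts["watson_crick"] += 1
        elif (m, p) in WOBBLE:
            counts["wobble"] += 1
        else:
            counts["mismatch"] += 1

    paired = counts["watson_crick"] + counts["wobble"]
    counts["paired_fraction"] = paired / paired_len
    return counts


if __name__ == "__main__":
    # Made-up sequences chosen so that every position is complementary.
    print(duplex_features("AAGGCCUUAG", "AAGGCCUUCC"))
```

A vector of such counts, computed for each candidate read pair in a small RNA library, is the kind of input a support vector machine classifier could be trained on. A realistic implementation would also need to handle bulges and thermodynamic stability, which this gap-free sketch ignores.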
Tseng, K. C., Chiang-Hsieh, Y. F., Pai, H., Chow, C. N., Lee, S. C., Zheng, H. Q., … Chang, W. C. (2018). MicroRPM: A microRNA prediction model based only on plant small RNA sequencing data. Bioinformatics, 34(7), 1108–1115. https://doi.org/10.1093/bioinformatics/btx725