MicroRNAs (miRNAs) are short (∼22 nucleotides), endogenously initiated non-coding RNAs that control gene expression post-transcriptionally, either by degrading target mRNAs or by inhibiting protein translation. Predicting miRNA genes is a challenging step toward understanding post-transcriptional gene regulation. The present paper focuses on developing a computational method for the identification of miRNA precursors. We propose a machine learning algorithm based on Random Forests (RF) for miRNA prediction. The prediction algorithm relies on a set of features, compiled from known features as well as others introduced for the first time, that yields a performance better than that of most well-known miRNA classifiers. The method achieves 91.3% accuracy, 86% F-measure, 97.2% specificity, 93.4% precision, and 79.6% sensitivity when tested on real data. Our method obtains better results than MiPred (the best currently known RF algorithm in the literature), Triplet-SVM, Virgo, and EumiR. The obtained results indicate that Random Forests is a better alternative to Support Vector Machines (SVM) for miRNA prediction, especially in terms of the accuracy and F-measure metrics.
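The five reported metrics are all derived from the same confusion matrix, and the F-measure is the harmonic mean of precision and sensitivity. The following is a minimal sketch of those standard definitions, not the authors' evaluation code; the function names and the example counts are hypothetical and chosen only for illustration.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: real miRNA precursors predicted as positive/negative.
    tn/fp: non-precursors predicted as negative/positive.
    """
    sensitivity = tp / (tp + fn)            # recall on true precursors
    specificity = tn / (tn + fp)            # recall on non-precursors
    precision = tp / (tp + fp)              # fraction of positive calls that are correct
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "f_measure": f_measure,
        "specificity": specificity,
        "precision": precision,
        "sensitivity": sensitivity,
    }


def f_measure_from(precision, sensitivity):
    """Harmonic mean of precision and sensitivity."""
    return 2 * precision * sensitivity / (precision + sensitivity)


if __name__ == "__main__":
    # Hypothetical counts, not taken from the paper.
    print(classification_metrics(tp=8, fp=2, tn=18, fn=2))
    # Consistency check on the figures quoted in the abstract.
    print(round(f_measure_from(0.934, 0.796), 2))  # prints 0.86
```

Plugging the abstract's precision (93.4%) and sensitivity (79.6%) into the harmonic mean gives roughly 0.86, which agrees with the reported 86% F-measure.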
CITATION STYLE
Elgokhy, S. M., Shibuya, T., & Shoukry, A. (2014). Improving miRNA classification using an exhaustive set of features. In Advances in Intelligent Systems and Computing (Vol. 294, pp. 31–39). Springer Verlag. https://doi.org/10.1007/978-3-319-07581-5_4