Identifying lncRNA based on support vector machine

Abstract

With the development of high-throughput sequencing technology, large volumes of transcriptome data have become available. Identification of long non-protein-coding RNAs (lncRNAs) is pervasive in transcriptome studies because of their important roles in biological processes. This paper proposes a computational method for identifying lncRNAs based on machine learning. The method first extracts features by traversing the transcript sequence with k-mers to obtain a large class of features, which are integrated with GC content and sequence length. It then uses a variance test with grid searching to select three kinds of features, which reduces the data dimension and the load on the support vector machine used to establish the recognition model. The final model shows a certain stability and robustness. The method obtains 95.7% accuracy and an AUC of 0.99 on the test dataset, so it could be promising for identifying lncRNAs.
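
The feature steps described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state which k values, variance threshold, or grid-search settings were used, so the choice of k = 1, 2, 3, the threshold argument, and all function names here are assumptions. The classifier itself is not shown; the selected columns would be passed to a support vector machine (for example, scikit-learn's SVC) in a separate step.

```python
from itertools import product
from statistics import pvariance

ALPHABET = "ACGT"


def kmer_frequencies(seq, k):
    """Slide a window of width k over seq and return normalized k-mer counts."""
    seq = seq.upper().replace("U", "T")
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    windows = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows with ambiguous bases such as N
            counts[kmer] += 1
            windows += 1
    return [counts[m] / windows if windows else 0.0 for m in kmers]


def gc_content(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def feature_vector(seq, ks=(1, 2, 3)):
    """Concatenate k-mer frequencies, GC content, and sequence length.

    The default ks is an assumption made for illustration only.
    """
    features = []
    for k in ks:
        features.extend(kmer_frequencies(seq, k))
    features.append(gc_content(seq))
    features.append(float(len(seq)))
    return features


def variance_filter(rows, threshold):
    """Return indices of feature columns whose variance exceeds threshold."""
    columns = list(zip(*rows))
    return [j for j, col in enumerate(columns) if pvariance(col) > threshold]
```

In use, each transcript would be mapped through `feature_vector`, `variance_filter` would be applied to the training rows to pick the columns to keep, and the threshold would be one of the values explored during grid searching.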

Cite

Li, Y., Ou, Y., Xu, Z., & Gong, L. (2019). Identifying lncRNA based on support vector machine. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11837 LNCS, pp. 68–75). Springer. https://doi.org/10.1007/978-3-030-32962-4_7
