A Database and Evaluation for Classification of RNA Molecules Using Graph Methods

0Citations
Citations of this article
2Readers
Mendeley users who have this article in their library.
Get full text

Abstract

In this paper, we introduce a new graph dataset based on the representation of RNA. The RNA dataset includes 3178 RNA chains which are labelled in 8 classes according to their reported biological functions. The goal of this database is to provide a platform for investigating the classification of RNA using graph-based methods. The molecules are represented by graphs representing the sequence and base-pairs of the RNA, with a number of labelling schemes using base labels and local shape. We report the results of a number of state-of-the-art graph based methods on this dataset as a baseline comparison and investigate how these methods can be used to categorise RNA molecules on their type and functions. The methods applied are Weisfeiler Lehman and optimal assignment kernels, shortest paths kernel and the all paths and cycle methods. We also compare to the standard Needleman-Wunsch algorithm used in bioinformatics for DNA and RNA comparison, and demonstrate the superiority of graph kernels even on a string representation. The highest classification rate is obtained by the WL-OA algorithm using base labels and base-pair connections.

Cite

CITATION STYLE

APA

Algul, E., & Wilson, R. C. (2019). A Database and Evaluation for Classification of RNA Molecules Using Graph Methods. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11510 LNCS, pp. 78–87). Springer Verlag. https://doi.org/10.1007/978-3-030-20081-7_8

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free