Matataki: An ultrafast mRNA quantification method for large-scale reanalysis of RNA-Seq data

Abstract

Background: Data generated by RNA sequencing (RNA-Seq) are accumulating in vast amounts in public repositories, especially for human and mouse. Reanalyzing these data has emerged as a promising approach to identifying gene modules or pathways. Although meta-analyses of gene expression are frequently performed using microarray data, meta-analyses using RNA-Seq data are still rare. This lag is partly due to the extensive computational resources that reanalyzing RNA-Seq data requires. Moreover, it is nearly impossible to calculate the gene expression levels of all samples in a public repository using currently available methods. Here, we propose a novel method, Matataki, for rapidly estimating gene expression levels from RNA-Seq data.

Results: The proposed method uses k-mers that are unique to each gene to map fragments to genes. Because aligning fragments to reference sequences is computationally expensive, our method reduces the calculation cost by focusing on k-mers that are unique to each gene and by skipping uninformative regions. Indeed, Matataki outperformed conventional methods with regard to speed while demonstrating sufficient accuracy.

Conclusions: Matataki can overcome current limitations in reanalyzing RNA-Seq data, improving the potential for discovering genes and pathways associated with disease at reduced computational cost. As a result, the main bottleneck of RNA-Seq analyses has shifted to decompressing the sequenced data. The implementation of Matataki is available at https://github.com/informationsea/Matataki.
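
The idea of assigning fragments to genes through gene-unique k-mers can be illustrated with a minimal Python sketch. This is not the Matataki implementation: the function names, the toy sequences, the choice of k = 5, and the rule of counting a read only when all of its indexed k-mers point to a single gene are assumptions made for illustration. The skipping of uninformative regions and the handling of reverse-complement strands are omitted.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Yield every overlapping k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_unique_index(genes, k):
    """Return a dict mapping each k-mer found in exactly one gene to that gene."""
    owners = defaultdict(set)
    for name, seq in genes.items():
        for km in kmers(seq, k):
            owners[km].add(name)
    # Keep only k-mers owned by a single gene; shared k-mers are uninformative.
    return {km: next(iter(names)) for km, names in owners.items() if len(names) == 1}


def count_reads(reads, index, k):
    """Count reads per gene using exact k-mer lookups instead of alignment."""
    counts = Counter()
    for read in reads:
        hits = {index[km] for km in kmers(read, k) if km in index}
        # Assumed rule: count the read only if its unique k-mers agree on one gene.
        if len(hits) == 1:
            counts[hits.pop()] += 1
    return counts


# Hypothetical toy data: two short "genes" that share the region ACGGTCA.
K = 5
genes = {"geneA": "ACGTACGGTCA", "geneB": "TTGACGGTCAA"}
reads = ["ACGTACG", "TTGACGG", "ACGGTCA"]

index = build_unique_index(genes, K)
counts = count_reads(reads, index, K)
print(dict(counts))
```

In this toy example the third read falls entirely within the region shared by both genes, so it contains no gene-unique k-mer and is left uncounted. Each lookup is a hash-table access, which is why an approach of this kind avoids the cost of aligning every fragment to a reference.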

Cite

Okamura, Y., & Kinoshita, K. (2018). Matataki: An ultrafast mRNA quantification method for large-scale reanalysis of RNA-Seq data. BMC Bioinformatics, 19(1). https://doi.org/10.1186/s12859-018-2279-y
