High-throughput DNA sequence data compression

Citations: 66
Readers (Mendeley): 99

This article is free to access.

Abstract

The exponential growth of high-throughput DNA sequence data has posed great challenges to genomic data storage, retrieval and transmission. Compression is a critical tool to address these challenges, and many methods have been developed to reduce the storage size of genomes and sequencing data (reads, quality scores and metadata). However, genomic data are being generated faster than they can be meaningfully analyzed, leaving large scope for developing novel compression algorithms that directly facilitate data analysis, beyond data transfer and storage. In this article, we categorize and comprehensively review the existing compression methods specialized for genomic data and present experimental results on compression ratio, memory usage, and compression and decompression time. We further present the remaining challenges and potential directions for future research.

Citation (APA)

Zhu, Z., Zhang, Y., Ji, Z., He, S., & Yang, X. (2013). High-throughput DNA sequence data compression. Briefings in Bioinformatics, 16(1), 1–15. https://doi.org/10.1093/bib/bbt087
