LFQC: A lossless compression algorithm for FASTQ files

45Citations
Citations of this article
63Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Motivation: Next Generation Sequencing (NGS) technologies have revolutionized genomic research by reducing the cost of whole genome sequencing. One of the biggest challenges posed by modern sequencing technology is economic storage of NGS data. Storing raw data is infeasible because of its enormous size and high redundancy. In this article, we address the problem of storage and transmission of large FASTQ files using innovative compression techniques. Results: We introduce a new lossless non-reference based FASTQ compression algorithm named Lossless FASTQ Compressor. We have compared our algorithm with other state of the art big data compression algorithms namely gzip, bzip2, fastqz (Bonfield and Mahoney, 2013), fqzcomp (Bonfield and Mahoney, 2013), Quip (Jones et al., 2012), DSRC2 (Roguski and Deorowicz, 2014). This comparison reveals that our algorithm achieves better compression ratios on LS454 and SOLiD datasets.

Cite

CITATION STYLE

APA

Nicolae, M., Pathak, S., & Rajasekaran, S. (2015). LFQC: A lossless compression algorithm for FASTQ files. Bioinformatics, 31(20), 3276–3281. https://doi.org/10.1093/bioinformatics/btv384

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free