Compression of FASTQ and SAM Format Sequencing Data

Abstract

Storage and transmission of the data produced by modern DNA sequencing instruments have become a major concern, which prompted the Pistoia Alliance to pose the SequenceSqueeze contest for compression of FASTQ files. We present several compression entries from the competition, Fastqz and Samcomp/Fqzcomp, including the winning entry. These are compared against existing algorithms for both reference-based compression (CRAM, Goby) and non-reference-based compression (DSRC, BAM), and against other recently published competition entries (Quip, SCALCE). The tools are shown to form the new Pareto frontier for FASTQ compression, offering state-of-the-art compression ratios at affordable CPU costs. All programs are freely available on SourceForge. Fastqz: https://sourceforge.net/projects/fastqz/, fqzcomp: https://sourceforge.net/projects/fqzcomp/, and samcomp: https://sourceforge.net/projects/samcomp/. © 2013 Bonfield, Mahoney.

Bonfield, J. K., & Mahoney, M. V. (2013). Compression of FASTQ and SAM Format Sequencing Data. PLoS ONE, 8(3). https://doi.org/10.1371/journal.pone.0059190
