ScaleQC: A scalable lossy to lossless solution for NGS data compression

7Citations
Citations of this article
10Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Motivation: Per-base quality values in Next Generation Sequencing data take a significant portion of storage even after compression. Lossy compression technologies could further reduce the space used by quality values. However, in many applications, lossless compression is still desired. Hence, sequencing data in multiple file formats have to be prepared for different applications. Results: We developed a scalable lossy to lossless compression solution for quality values named ScaleQC (Scalable Quality value Compression). ScaleQC is able to provide the so-called bit-stream level scalability that the losslessly compressed bit-stream by ScaleQC can be further truncated to lower data rates without incurring an expensive transcoding operation. Despite its scalability, ScaleQC still achieves comparable compression performance at both lossless and lossy data rates compared to the existing lossless or lossy compressors. Availability and implementation: ScaleQC has been integrated with SAMtools as a special quality value encoding mode for CRAM. Its source codes can be obtained from our integrated SAMtools (https://github.com/xmuyulab/sam tools) with dependency on integrated HTSlib (https://github.com/xmuyulab/htslib).

Cite

CITATION STYLE

APA

Yu, R., & Yang, W. (2020). ScaleQC: A scalable lossy to lossless solution for NGS data compression. Bioinformatics, 36(17), 4551–4559. https://doi.org/10.1093/bioinformatics/btaa543

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free