XSI - a genotype compression tool for compressive genomics in large biobanks

Abstract

Motivation: Generation of genotype data has been growing exponentially over the last decade. With the large size of recent datasets comes a storage and computational burden with ever-increasing costs. To reduce this burden, we propose XSI, a file format with a reduced storage footprint that also allows computation on the compressed data, and we show how this can improve future analyses.

Results: We show that xSqueezeIt (XSI) allows for a file size reduction of 4-20× compared with compressed BCF and demonstrate its potential for 'compressive genomics' on the UK Biobank whole-genome sequencing genotypes, with 8× faster loading times, 5× faster runs-of-homozygosity computation, 30× faster dot-product computation and 280× faster allele counting.

Cite

Wertenbroek, R., Rubinacci, S., Xenarios, I., Thoma, Y., & Delaneau, O. (2022). XSI - a genotype compression tool for compressive genomics in large biobanks. Bioinformatics, 38(15), 3778–3784. https://doi.org/10.1093/bioinformatics/btac413