Enabling Genomics Pipelines in Commodity Personal Computers With Flash Storage

Abstract

Analysis of a patient's genomics data is the first step toward precision medicine. Such analyses are performed on expensive enterprise-class server machines because the input data sets are large, and the intermediate data structures are even larger (TB-size) and require random accesses. We present a general method for solving a specific genomics problem, mutation detection, on a cheap commodity personal computer (PC) with a small amount of DRAM. We construct and access large histograms of k-mers efficiently on external storage (SSDs) and apply our technique to a state-of-the-art reference-free genomics algorithm, SMUFIN, to create SMUFIN-F. We show that on two PCs, SMUFIN-F can achieve the same throughput at roughly one third (36%) of the hardware cost and under half (45%) of the energy compared to SMUFIN on an enterprise-class server. To the best of our knowledge, SMUFIN-F is the first reference-free system that can detect somatic mutations on commodity PCs for whole human genomes. We believe our technique should apply to other k-mer or n-gram-based algorithms.
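
To make the idea of a k-mer histogram that does not fit in DRAM more concrete, here is a minimal sketch of a generic two-pass, disk-partitioned counting scheme. It is not SMUFIN-F's actual method, which the abstract does not detail; the function names, the text-file partition format, and the CRC32-based partitioning are all assumptions chosen for illustration.

```python
import os
import tempfile
import zlib
from collections import Counter


def kmers(seq, k):
    """Yield every length-k substring of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def count_kmers_partitioned(reads, k, num_partitions, workdir):
    """Yield (k-mer, count) pairs, holding one partition's histogram in memory at a time.

    Hypothetical helper for illustration; not taken from SMUFIN-F.
    """
    paths = [os.path.join(workdir, f"part_{p}.txt") for p in range(num_partitions)]
    files = [open(path, "w") for path in paths]
    try:
        # Pass 1: route each k-mer to a partition file using a stable hash,
        # so all occurrences of the same k-mer land in the same file.
        for read in reads:
            for kmer in kmers(read, k):
                p = zlib.crc32(kmer.encode("ascii")) % num_partitions
                files[p].write(kmer + "\n")
    finally:
        for f in files:
            f.close()

    # Pass 2: partitions hold disjoint sets of k-mers, so each one can be
    # counted independently and dropped before the next is loaded.
    for path in paths:
        with open(path) as f:
            hist = Counter(line.rstrip("\n") for line in f)
        yield from hist.items()


if __name__ == "__main__":
    reads = ["ACGTACGT", "CGTACG"]  # toy input, not real sequencing data
    with tempfile.TemporaryDirectory() as tmp:
        counts = dict(count_kmers_partitioned(reads, 3, 4, tmp))
    print(counts)
```

Increasing `num_partitions` shrinks the largest in-memory histogram at the cost of more files and more sequential I/O, which is the kind of trade-off that makes fast external storage attractive for this workload.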

Cite

Cadenelli, N., Jun, S. W., Polo, J., Wright, A., Carrera, D., & Arvind. (2021). Enabling Genomics Pipelines in Commodity Personal Computers With Flash Storage. Frontiers in Genetics, 12. https://doi.org/10.3389/fgene.2021.615958
