Sprite: A fast parallel SNP detection pipeline

Abstract

We present Sprite, a new high-performance data analysis pipeline for detecting single nucleotide polymorphisms (SNPs) in the human genome. A SNP detection pipeline for next-generation sequencing data uses several software tools, including tools for read alignment, processing alignment output, and SNP identification. We target end-to-end scalability and I/O efficiency in Sprite by merging tools in this pipeline and eliminating redundancies. For a benchmark human whole-genome sequencing data set, Sprite takes less than 50 min on 16 nodes of the TACC Stampede supercomputer. A key component of our optimized pipeline is parsnip, a new parallel method and software tool for SNP detection. We find that the quality of results obtained by parsnip (sensitivity and precision using high-confidence variant calls as ground truth) is comparable to state-of-the-art SNP-calling software. A prototype implementation of Sprite is available at sprite-psu.sourceforge.net.
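
The abstract's quality metrics, sensitivity and precision against a ground-truth call set, can be illustrated with a minimal sketch. This is not code from Sprite or parsnip. The function name `sensitivity_precision`, the tuple representation of a variant, and the toy call sets are all assumptions made for illustration, and a real evaluation would also need to handle variant normalization and confident regions.

```python
from typing import Set, Tuple

# Hypothetical variant representation: (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def sensitivity_precision(called: Set[Variant], truth: Set[Variant]) -> Tuple[float, float]:
    """Compare a set of called SNPs against a ground-truth set by exact match."""
    true_pos = len(called & truth)    # called and present in the truth set
    false_neg = len(truth - called)   # in the truth set but missed by the caller
    false_pos = len(called - truth)   # called but absent from the truth set

    # Sensitivity (recall): fraction of true variants that were detected.
    sensitivity = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    # Precision: fraction of calls that are true variants.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    return sensitivity, precision


# Toy data (made up for this example, not from the paper).
truth = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 75, "G", "A"),
    ("chr2", 300, "T", "C"),
}
called = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 75, "G", "A"),
    ("chr3", 10, "A", "C"),
}

sens, prec = sensitivity_precision(called, truth)
print(f"sensitivity={sens:.2f} precision={prec:.2f}")
```

In this toy case three of the four true variants are found and one call is spurious, so both metrics equal 0.75.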

Cite

Rengasamy, V., & Madduri, K. (2016). Sprite: A fast parallel SNP detection pipeline. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9697, pp. 159–177). Springer Verlag. https://doi.org/10.1007/978-3-319-41321-1_9