Large-scale DNA sequence analysis in the cloud: A stream-based approach

Romeo Kienzler; Rémy Bruggmann; Anand Ranganathan; Nesime Tatbul

Conference Proceedings, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2012) 7156 LNCS(PART 2) 467-476

DOI: 10.1007/978-3-642-29740-3_52

10 citations

35 readers

Abstract

Cloud computing technologies have made it possible to analyze big data sets in scalable and cost-effective ways. DNA sequence analysis, where very large data sets are now generated at reduced cost using Next-Generation Sequencing (NGS) methods, is an area that can greatly benefit from cloud-based infrastructures. Although existing solutions show nearly linear scalability, they are significantly limited by data transfer latencies and cloud storage costs. In this paper, we propose a streaming data management architecture to tackle the performance problems that arise from having to transfer large amounts of data between clients and the cloud. Our approach provides an incremental data processing model that can hide data transfer latencies while maintaining linear scalability. We present an initial implementation and evaluation of this approach for SHRiMP, a well-known software package for NGS read alignment, based on the IBM InfoSphere Streams computing platform deployed on Amazon EC2. © 2012 Springer-Verlag Berlin Heidelberg.

Cite

Kienzler, R., Bruggmann, R., Ranganathan, A., & Tatbul, N. (2012). Large-scale DNA sequence analysis in the cloud: A stream-based approach. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7156 LNCS, pp. 467–476). Springer Verlag. https://doi.org/10.1007/978-3-642-29740-3_52
