Performance analysis of parallel de novo genome assembly in shared memory system

Abstract

De novo genome assembly is a computationally intensive task in genome analysis: it reconstructs a whole genome from the small fragments (reads) generated by a next-generation sequencing (NGS) platform. Parallel processing is one way to reduce the computation time. In this work, we analyze the performance of three popular de novo genome assembly tools based on the de Bruijn graph, namely Velvet, SOAPdenovo2, and ABySS, in a parallel environment. Simulated and real genome datasets from several species are used in this study. We evaluate performance using two criteria: the quality of the contigs produced and the parallel performance. For contig quality, we measure the N50 size, the number of contigs, and the maximum contig length. For parallel performance, we measure the speedup obtained from using a multi-core CPU in a shared memory system. Lastly, the memory usage of each tool is also compared. In our experiments, SOAPdenovo2 produces the best-quality contigs, with the highest N50 value. All three assembly tools work well in the parallel environment and achieve significant speedup. SOAPdenovo2 achieves the highest speedup, a super-linear speedup of 22 times. As for memory usage, ABySS is the most efficient.
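The metrics named in the abstract can be illustrated with a minimal Python sketch. It assumes the common definition of N50 (the contig length at which contigs, sorted longest first, cumulatively cover at least half of the total assembly length) and the usual definition of speedup (serial time divided by parallel time); the paper may compute them through tool-specific reports. The function names and the sample numbers are hypothetical and are not taken from the study.

```python
from typing import Dict, Sequence


def n50(lengths: Sequence[int]) -> int:
    """Return the N50 of a set of contig lengths (assumed standard definition)."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    ordered = sorted(lengths, reverse=True)   # longest contigs first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:             # half of the assembly is covered
            return length
    raise AssertionError("unreachable for non-empty input")


def contig_stats(lengths: Sequence[int]) -> Dict[str, int]:
    """Collect the three contig-quality measures named in the abstract."""
    return {
        "n50": n50(lengths),
        "num_contigs": len(lengths),
        "max_length": max(lengths),
    }


def speedup(t_serial: float, t_parallel: float) -> float:
    """Speedup = single-core running time / multi-core running time."""
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, cores: int) -> float:
    """Speedup per core; a value above 1.0 indicates super-linear speedup."""
    return speedup(t_serial, t_parallel) / cores


if __name__ == "__main__":
    # Hypothetical contig lengths and timings, for illustration only.
    print(contig_stats([2, 2, 2, 3, 3, 4, 8, 8]))
    print(speedup(100.0, 25.0), efficiency(100.0, 25.0, 4))
```

In this sketch, a speedup is called super-linear when it exceeds the number of cores used, which is the same as an efficiency above 1.0.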

Cite

Iryanto, S. B., Kusuma, W. A., & Sukoco, H. (2018). Performance analysis of parallel de novo genome assembly in shared memory system. In IOP Conference Series: Earth and Environmental Science (Vol. 187). Institute of Physics Publishing. https://doi.org/10.1088/1755-1315/187/1/012032
