A combined approach for de novo DNA sequence assembly of very short reads

3Citations
Citations of this article
12Readers
Mendeley users who have this article in their library.

Abstract

De novo DNA sequence assembly is very important in genome sequence analysis. In this paper, we investigated two of the major approaches for de novo DNA sequence assembly of very short reads: overlap-layout-consensus (OLC) and Eulerian path. From that investigation, we developed a new assembly technique by combining the OLC and the Eulerian path methods in a hierarchical process. The contigs yielded by these two approaches were treated as reads and were assembled again to yield longer contigs. We tested our approach using three real very-short-read datasets generated by an Illumina Genome Analyzer and four simulated very-short-read datasets that contained sequencing errors. The sequencing errors were modeled based on Illumina's sequencing technology. As a result, our combined approach yielded longer contigs than those of Edena (OLC) and Velvet (Eulerian path) in various coverage depths and was comparable to SOAPdenovo, in terms of N50 size and maximum contig lengths. The assembly results were also validated by comparing contigs that were produced by assemblers with their reference sequence from an NCBI database. The results show that our approach produces more accurate results than Velvet, Edena, and SOAPdenovo alone. This comparison indicates that our approach is a viable way to assemble very short reads from next generation sequencers. © 2011 Information Processing Society of Japan.

Cite

CITATION STYLE

APA

Kusuma, W. A., Ishida, T., & Akiyama, Y. (2011). A combined approach for de novo DNA sequence assembly of very short reads. IPSJ Transactions on Bioinformatics, 4, 21–33. https://doi.org/10.2197/ipsjtbio.4.21

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free