A representation of a compressed de Bruijn graph for pan-genome analysis that enables search

Abstract

Background: Recently, Marcus et al. (Bioinformatics 30:3476-83, 2014) proposed using a compressed de Bruijn graph to describe the relationship between the genomes of many individuals or strains of the same or closely related species. They devised an O(n log g) time algorithm called splitMEM that constructs this graph directly (i.e., without using the uncompressed de Bruijn graph) based on a suffix tree, where n is the total length of the genomes and g is the length of the longest genome. Baier et al. (Bioinformatics 32:497-504, 2016) improved their result.

Results: In this paper, we propose a new space-efficient representation of the compressed de Bruijn graph that adds the possibility to search for a pattern (e.g., an allele, which is a variant form of a gene) within the pan-genome. The ability to search within the pan-genome graph is of utmost importance and is a design goal of pan-genome data structures.
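To make the term "compressed de Bruijn graph" concrete, the following is a minimal Python sketch that collects the k-mers of a few short sequences, links k-mers that occur consecutively, and then merges every non-branching path into a single node. It is only an illustration: it takes the naive route through the uncompressed graph, which is exactly what splitMEM and its successors avoid, and it is not the space-efficient representation proposed in the paper. The function name `compressed_de_bruijn` and the toy inputs are assumptions made for this example.

```python
from collections import defaultdict


def compressed_de_bruijn(sequences, k):
    """Return the node labels of a compressed de Bruijn graph (naive sketch).

    Assumption: this builds the ordinary (uncompressed) graph first and then
    merges non-branching paths. It is NOT splitMEM or the paper's method.
    """
    succ = defaultdict(set)  # k-mer -> k-mers that directly follow it
    pred = defaultdict(set)  # k-mer -> k-mers that directly precede it
    nodes = set()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            nodes.add(kmer)
            if i + k < len(seq):
                nxt = seq[i + 1:i + k + 1]
                succ[kmer].add(nxt)
                pred[nxt].add(kmer)

    def starts_path(v):
        # v can be merged into its predecessor only if it has exactly one
        # predecessor and that predecessor has v as its only successor.
        if len(pred[v]) != 1:
            return True
        (p,) = pred[v]
        return len(succ[p]) != 1

    def extend(v, visited):
        # Walk forward while the path does not branch, appending one
        # character per merged k-mer.
        label, cur = v, v
        visited.add(v)
        while len(succ[cur]) == 1:
            (nxt,) = succ[cur]
            if len(pred[nxt]) != 1 or nxt in visited:
                break
            label += nxt[-1]
            cur = nxt
            visited.add(nxt)
        return label

    visited = set()
    labels = [extend(v, visited) for v in sorted(nodes) if starts_path(v)]
    # Any k-mer still unvisited lies on an isolated cycle without branches.
    labels += [extend(v, visited) for v in sorted(nodes) if v not in visited]
    return labels


if __name__ == "__main__":
    # Two toy "genomes" that share the prefix ACG and then diverge.
    print(compressed_de_bruijn(["ACGTT", "ACGAT"], 3))
```

In the toy input, the shared 3-mer `ACG` stays a node of its own because it has two different successors, while each diverging tail collapses into one longer node. Searching for a pattern in such a graph, which is the contribution the abstract describes, requires additional index structures that this sketch does not attempt to model.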

Cite

Beller, T., & Ohlebusch, E. (2016). A representation of a compressed de Bruijn graph for pan-genome analysis that enables search. Algorithms for Molecular Biology, 11(1). https://doi.org/10.1186/s13015-016-0083-7
