Fragmented BWT: An extended BWT for full-text indexing

Abstract

This paper proposes the Fragmented Burrows-Wheeler Transform (FBWT), an extension of the well-known BWT structure for full-text indexing and searching. An FBWT consists of a number of BWT fragments, each covering only a subset of the suffixes of the original string. Because constructing an FBWT does not entail building the BWT of the whole string, it is faster than constructing a regular BWT. On the other hand, searching with FBWT can be more costly than searching with BWT, since every fragment must be searched: the amount of work is $O(dp + \mathit{occ} \log^{1+\epsilon} n)$, as opposed to $O(p + \mathit{occ} \log^{1+\epsilon} n)$ for regular BWT, where p is the length of the query string, n the length of the original text, occ the number of occurrences of the query string, and d the number of fragments. To compensate for this search cost, searching with FBWT can be accelerated with SIMD instructions by searching multiple fragments in parallel. Experiments show that building an FBWT is about twice as fast as building a BWT with a state-of-the-art algorithm (SA-IS), and that FBWT's search performance relative to BWT's depends on the number of occurrences, ranging from four times slower than BWT (when there are few occurrences) to twice as fast as BWT (when there are many).
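The fragment idea can be illustrated with a minimal Python sketch. It is not the paper's FBWT. As stated assumptions, it partitions suffixes by start position modulo d (the abstract does not say how suffixes are assigned to fragments), and it replaces BWT backward search with a sorted suffix list and binary search per fragment, so each fragment costs O(p log n) here rather than O(p). The names `build_fragments`, `_bound`, and `search` are hypothetical. The sketch only shows that each fragment indexes a subset of the suffixes and that a query has to visit all d fragments.

```python
def build_fragments(text, d):
    """Assumed partitioning: suffix i goes to fragment i % d.

    Each fragment is sorted on its own, so no sort over all suffixes
    of the whole string is ever performed.
    """
    return [sorted(range(k, len(text), d), key=lambda i: text[i:])
            for k in range(d)]


def _bound(text, frag, pattern, upper):
    """Binary search inside one fragment.

    Returns the first index whose suffix, truncated to len(pattern),
    is >= pattern (upper=False) or > pattern (upper=True).
    """
    lo, hi = 0, len(frag)
    p = len(pattern)
    while lo < hi:
        mid = (lo + hi) // 2
        prefix = text[frag[mid]:frag[mid] + p]
        if prefix < pattern or (upper and prefix == pattern):
            lo = mid + 1
        else:
            hi = mid
    return lo


def search(text, fragments, pattern):
    """Query every fragment and merge the matching start positions."""
    hits = []
    for frag in fragments:  # d independent lookups, one per fragment
        start = _bound(text, frag, pattern, upper=False)
        end = _bound(text, frag, pattern, upper=True)
        hits.extend(frag[start:end])
    return sorted(hits)


text = "abracadabra"
fragments = build_fragments(text, 3)
print(search(text, fragments, "abra"))  # [0, 7]
```

The loop over fragments in `search` is where the factor d in the search cost comes from. Its iterations are independent of one another, which is the property the abstract relies on when it proposes searching multiple fragments in parallel with SIMD instructions.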

Cite

Ito, M., Inoue, H., & Taura, K. (2016). Fragmented BWT: An extended BWT for full-text indexing. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9954 LNCS, pp. 97–109). Springer Verlag. https://doi.org/10.1007/978-3-319-46049-9_10
