Sparse and truncated suffix trees on variable-length codes

Takashi Uemura; Hiroki Arimura

Conference Proceedings

Sparse and truncated suffix trees on variable-length codes

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2011) 6661 LNCS 246-260

DOI: 10.1007/978-3-642-21458-5_22

6Citations

8Readers

Get full text

Abstract

The sparse suffix trees (SST), introduced by (Kärkkäinen and Ukkonen, COCOON 1996), is the suffix tree for a subset of all suffixes of an input text T of length n. In this paper, we study a special case that an input string is a sequence of k codewords drawn from a regular prefix code Δ ⊆ ∑+ recognized by a finite automaton, and index points locate on the code boundaries. In this case, we present an online algorithm that constructs the sparse suffix tree for an input string T on any variable-length regular prefix code, called the code suffix tree (CST), in O(n + m) time and O(k) additional space for a fixed base alphabet ∑, where m is the size of an automaton for Δ. Furthermore, we present a modified algorithm for ℓ-truncated version of code suffix trees that runs in the same time and space complexities. Hence, these results generalize the previous results (Inenaga and Takeda, CPM 2006) for word suffix trees and (Na, Apostolico, Iliopoulos, and Park, Theor. Comp. Sci., 304, 2003) for truncated suffix trees on arbitrary variable-length regular prefix codes, such as Huffman codes and multi-byte codes (e.g. UTF-8). © 2011 Springer-Verlag.

Cite

CITATION STYLE

APA

Uemura, T., & Arimura, H. (2011). Sparse and truncated suffix trees on variable-length codes. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6661 LNCS, pp. 246–260). https://doi.org/10.1007/978-3-642-21458-5_22

Sparse and truncated suffix trees on variable-length codes

Abstract

Cite

Register to see more suggestions