Given a string $S$ of length $N$ on a fixed alphabet of $\sigma$ symbols, a grammar compressor produces a context-free grammar $G$ of size $n$ that generates $S$ and only $S$. In this paper we describe data structures to support the following operations on a grammar-compressed string: $\mathrm{access}(S, i, j)$ (return the substring $S[i, j]$), $\mathrm{rank}_c(S, i)$ (return the number of occurrences of symbol $c$ before position $i$ in $S$), and $\mathrm{select}_c(S, i)$ (return the position of the $i$th occurrence of $c$ in $S$). Our main result for access is a method that requires $O(n \log N)$ bits of space and $O(\log N + m/\log_\sigma N)$ time to extract $m = j - i + 1$ consecutive symbols from $S$. Alternatively, we can achieve $O(\log_\tau N + m/\log_\sigma N)$ query time using $O(n\tau \log_\tau(N/n) \log N)$ bits of space, matching a lower bound stated by Verbin and Yu for strings where $N$ is polynomially related to $n$ when $\tau = \log^\epsilon N$. For rank and select we describe data structures of size $O(n\sigma \log N)$ bits that support the two operations in $O(\log N)$ time. We also extend our other structure to support both operations in $O(\log_\tau N)$ time using $O(n\tau\sigma \log_\tau(N/n) \log N)$ bits of space. When $\tau = \log^\epsilon N$ the query time is $O(\log N/\log\log N)$, and we provide a hardness result showing that significantly improving this would imply a major breakthrough on a hard graph-theoretical problem.
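To make the three operations concrete, the following is a minimal Python sketch of a naive baseline, not the data structures of the paper. It assumes the grammar is a straight-line program whose rules are either a single terminal or a pair of earlier rules, and it stores, for each rule, the expansion length and per-symbol counts. The class name `SLP`, its methods, and the example grammar are hypothetical. Queries descend from the root, so their cost is proportional to the grammar's height (plus $m$ for access) rather than the bounds stated above, and `rank` follows the convention of counting occurrences strictly before position $i$, with 1-based positions.

```python
from typing import Dict, List, Tuple, Union

# A rule is either a terminal symbol or a pair (left, right) of earlier rule indices.
Rule = Union[str, Tuple[int, int]]


class SLP:
    """Naive access/rank/select on a straight-line program (illustrative only)."""

    def __init__(self, rules: List[Rule]) -> None:
        # Assumption: rules are listed bottom-up; the last rule is the start symbol.
        self.rules = rules
        self.length: List[int] = []             # expansion length of each rule
        self.counts: List[Dict[str, int]] = []  # symbol counts of each expansion
        for r in rules:
            if isinstance(r, str):
                self.length.append(1)
                self.counts.append({r: 1})
            else:
                left, right = r
                self.length.append(self.length[left] + self.length[right])
                merged = dict(self.counts[left])
                for sym, k in self.counts[right].items():
                    merged[sym] = merged.get(sym, 0) + k
                self.counts.append(merged)
        self.root = len(rules) - 1

    def access(self, i: int, j: int) -> str:
        """Return S[i, j] (1-based, inclusive) without expanding all of S."""
        if not 1 <= i <= j <= self.length[self.root]:
            raise ValueError("need 1 <= i <= j <= N")
        out: List[str] = []
        # Each entry is (rule, lo, hi): a 0-based half-open range inside that rule.
        stack = [(self.root, i - 1, j)]
        while stack:
            v, lo, hi = stack.pop()
            if lo >= hi:
                continue  # requested range does not touch this subtree
            r = self.rules[v]
            if isinstance(r, str):
                out.append(r)
                continue
            left, right = r
            ll = self.length[left]
            # Push the right child first so the left child is expanded first.
            stack.append((right, max(lo - ll, 0), hi - ll))
            stack.append((left, lo, min(hi, ll)))
        return "".join(out)

    def rank(self, c: str, i: int) -> int:
        """Number of occurrences of c strictly before position i (1-based)."""
        if not 1 <= i <= self.length[self.root] + 1:
            raise ValueError("need 1 <= i <= N + 1")
        k = i - 1  # how many leading symbols still have to be counted
        v, total = self.root, 0
        while k > 0:
            r = self.rules[v]
            if isinstance(r, str):
                total += int(r == c)  # k is necessarily 1 at a terminal
                break
            left, right = r
            ll = self.length[left]
            if k >= ll:
                total += self.counts[left].get(c, 0)  # whole left child is covered
                k -= ll
                v = right
            else:
                v = left
        return total

    def select(self, c: str, i: int) -> int:
        """1-based position of the i-th occurrence of c in S."""
        if not 1 <= i <= self.counts[self.root].get(c, 0):
            raise ValueError("c occurs fewer than i times")
        v, pos = self.root, 0
        while True:
            r = self.rules[v]
            if isinstance(r, str):
                return pos + 1
            left, right = r
            in_left = self.counts[left].get(c, 0)
            if i <= in_left:
                v = left
            else:
                i -= in_left
                pos += self.length[left]
                v = right


# Hypothetical example grammar generating S = "abaababa" (N = 8).
rules: List[Rule] = ["b", "a", (1, 0), (2, 1), (3, 2), (4, 3)]
g = SLP(rules)
print(g.access(3, 6))    # aaba
print(g.rank("a", 5))    # 3
print(g.select("b", 2))  # 5
```

Storing one count per symbol per rule costs $O(n\sigma)$ words in this sketch, the same order as the $O(n\sigma \log N)$ bits quoted for rank and select. The sketch makes no attempt to balance the grammar or pack symbols into machine words, which the stated time bounds would require.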
CITATION STYLE
Belazzougui, D., Cording, P. H., Puglisi, S. J., & Tabei, Y. (2015). Access, rank, and select in grammar-compressed strings. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9294, pp. 142–154). Springer Verlag. https://doi.org/10.1007/978-3-662-48350-3_13