Pattern matching in Lempel-Ziv compressed strings: Fast, Simple, and Deterministic

Paweł Gawrychowski

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2011) 6942 LNCS 421-432

DOI: 10.1007/978-3-642-23719-5_36

Abstract

Countless variants of Lempel-Ziv compression are widely used in many real-life applications. This paper is concerned with a natural modification of the classical pattern matching problem inspired by the popularity of such compression methods: given an uncompressed pattern p[1..m] and a Lempel-Ziv representation of a string t[1..N], does p occur in t? Farach and Thorup [5] gave a randomized O(n log^2(N/n) + m) time solution for this problem, where n is the size of the compressed representation of t. Building on the methods of [3] and [6], we improve their result by developing a faster and fully deterministic O(n log(N/n) + m) time algorithm with the same space complexity. Note that for highly compressible texts, log(N/n) might be of order n, so for such inputs the improvement is very significant. A small fragment of our method can be used to give an asymptotically optimal solution for the substring hashing problem considered by Farach and Muthukrishnan [4]. © 2011 Springer-Verlag Berlin Heidelberg.
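
To make the problem statement concrete, here is a minimal sketch of the naive baseline that the compressed-matching algorithms are designed to beat: fully decompress t and then search it. This is not the algorithm from the paper. The phrase encoding (a single literal character, or a `(start, length)` copy from the already-decoded prefix, possibly self-overlapping) and the names `lz_decompress` and `occurs` are assumptions made purely for illustration.

```python
def lz_decompress(phrases):
    """Expand a hypothetical LZ77-style phrase list into the full string t.

    Each phrase is either a single literal character or a (start, length)
    pair that copies `length` characters beginning at 0-based position
    `start` of the text decoded so far. Copies may overlap their own output.
    """
    out = []
    for phrase in phrases:
        if isinstance(phrase, str):
            out.append(phrase)  # literal: assumed to be exactly one character
        else:
            start, length = phrase
            for i in range(length):
                # Copy one character at a time so self-overlapping copies work.
                out.append(out[start + i])
    return "".join(out)


def occurs(pattern, phrases):
    """Naive baseline: decompress all N characters of t, then search."""
    return pattern in lz_decompress(phrases)


# Four phrases (n = 4) describing a text of length N = 9.
phrases = ["a", "b", (0, 6), "c"]
print(lz_decompress(phrases))
print(occurs("babc", phrases))
print(occurs("bb", phrases))
```

The baseline always touches all N decoded characters, even when the input has only n phrases; the bounds quoted in the abstract depend on n, log(N/n), and m instead.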

APA

Gawrychowski, P. (2011). Pattern matching in Lempel-Ziv compressed strings: Fast, Simple, and deterministic. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6942 LNCS, pp. 421–432). https://doi.org/10.1007/978-3-642-23719-5_36
