Approximate string matching over Ziv-Lempel compressed text

Abstract

We present a solution to the problem of performing approximate pattern matching on compressed text. The format we choose is the Ziv-Lempel family, specifically the LZ78 and LZW variants. Given a text of length u compressed into length n, and a pattern of length m, we report all the R occurrences of the pattern in the text allowing up to k insertions, deletions and substitutions, in O(mkn + R) time. The existence problem needs O(mkn) time. We also show that the algorithm can be adapted to run in O(k^2 n + min(mkn, m^2 (mσ)^k) + R) average time, where σ is the alphabet size. The experimental results show a speedup over the basic approach for moderate m and small k.
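To make the problem setting concrete, the following is a minimal sketch of a decompress-then-search baseline, not the algorithm of the paper. It assumes a simple LZ78 encoding as a list of (phrase index, character) pairs and uses the classical column-wise dynamic programming for approximate matching, which takes O(mu) time on the decompressed text. The function names `lz78_compress`, `lz78_decompress` and `approx_end_positions` are hypothetical and chosen only for illustration.

```python
def lz78_compress(text):
    """Encode text as LZ78 phrases: (index of earlier phrase, next char).

    Index 0 denotes the empty phrase. The number of phrases plays the
    role of n in the abstract; len(text) plays the role of u.
    """
    dictionary = {"": 0}
    phrases = []
    current = ""
    for ch in text:
        if current + ch in dictionary:
            current += ch
        else:
            phrases.append((dictionary[current], ch))
            dictionary[current + ch] = len(dictionary)
            current = ""
    if current:
        # Leftover input that is already a known phrase: emit it as
        # its own prefix plus last character (assumption of this sketch).
        phrases.append((dictionary[current[:-1]], current[-1]))
    return phrases


def lz78_decompress(phrases):
    """Rebuild the original text from the (index, char) pairs."""
    table = [""]
    out = []
    for ref, ch in phrases:
        phrase = table[ref] + ch
        table.append(phrase)
        out.append(phrase)
    return "".join(out)


def approx_end_positions(text, pattern, k):
    """Return 1-based end positions in text where the pattern matches
    with at most k insertions, deletions and substitutions."""
    m = len(pattern)
    col = list(range(m + 1))  # distances against the empty text prefix
    ends = []
    for j, ch in enumerate(text, start=1):
        prev_diag = col[0]
        col[0] = 0  # an occurrence may start at any text position
        for i in range(1, m + 1):
            above_old = col[i]
            cost = 0 if pattern[i - 1] == ch else 1
            col[i] = min(prev_diag + cost, above_old + 1, col[i - 1] + 1)
            prev_diag = above_old
        if col[m] <= k:
            ends.append(j)
    return ends


if __name__ == "__main__":
    text = "abracadabra"
    phrases = lz78_compress(text)
    restored = lz78_decompress(phrases)
    print(approx_end_positions(restored, "abr", 0))  # [3, 10]
```

The baseline pays for full decompression before any matching starts; the bounds stated in the abstract depend on the compressed length n rather than on u.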

Cite

Kärkkäinen, J., Navarro, G., & Ukkonen, E. (2000). Approximate string matching over Ziv-Lempel compressed text. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1848, pp. 195–209). Springer Verlag. https://doi.org/10.1007/3-540-45123-4_18
