We present a solution to the problem of approximate pattern matching on compressed text. The compression format we consider is the Ziv-Lempel family, specifically the LZ78 and LZW variants. Given a text of length u compressed into length n, and a pattern of length m, we report all R occurrences of the pattern in the text, allowing up to k insertions, deletions and substitutions, in O(mkn + R) time. The existence problem needs O(mkn) time. We also show that the algorithm can be adapted to run in O(k^2 n + min(mkn, m^2 (mσ)^k) + R) average time, where σ is the alphabet size. The experimental results show a speedup over the basic approach for moderate m and small k.
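For orientation, the setting described above can be sketched in a few lines of Python. This is not the paper's algorithm, which works on the compressed representation directly. It is only a baseline, assuming that "the basic approach" means decompressing the text and then searching it. The function names (`lz78_compress`, `lz78_decompress`, `approx_search`) are hypothetical, and the search is the classical column-wise dynamic programming for matching with up to k errors, which takes O(mu) time on the uncompressed text.

```python
def lz78_compress(text):
    """Encode text as LZ78 pairs (phrase_index, char); index 0 is the empty phrase."""
    dictionary = {"": 0}
    pairs = []
    current = ""
    for ch in text:
        if current + ch in dictionary:
            current += ch                      # keep extending a known phrase
        else:
            pairs.append((dictionary[current], ch))
            dictionary[current + ch] = len(dictionary)
            current = ""
    if current:
        # Flush a trailing phrase that is already known; its prefix is
        # guaranteed to be in the dictionary because LZ78 is prefix-closed.
        pairs.append((dictionary[current[:-1]], current[-1]))
    return pairs


def lz78_decompress(pairs):
    """Rebuild the text from LZ78 pairs."""
    phrases = [""]
    out = []
    for index, ch in pairs:
        phrase = phrases[index] + ch
        phrases.append(phrase)
        out.append(phrase)
    return "".join(out)


def approx_search(text, pattern, k):
    """Return 1-based end positions where pattern matches with at most k errors."""
    m = len(pattern)
    col = list(range(m + 1))                   # column for the empty text prefix
    ends = []
    for j, tc in enumerate(text, start=1):
        prev_diag = col[0]                     # col[0] stays 0: a match may start anywhere
        for i in range(1, m + 1):
            old = col[i]
            cost = 0 if pattern[i - 1] == tc else 1
            col[i] = min(prev_diag + cost,     # match or substitution
                         old + 1,              # extra character in the text
                         col[i - 1] + 1)       # pattern character missing from the text
            prev_diag = old
        if col[m] <= k:
            ends.append(j)
    return ends


def search_compressed_baseline(pairs, pattern, k):
    """Assumed baseline: decompress first, then search the plain text."""
    return approx_search(lz78_decompress(pairs), pattern, k)
```

Here the number of pairs plays the role of n and the decompressed length the role of u, so the baseline pays for the full uncompressed length, whereas the bounds stated above depend on n.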
CITATION STYLE
Kärkkäinen, J., Navarro, G., & Ukkonen, E. (2000). Approximate string matching over Ziv-Lempel compressed text. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1848, pp. 195–209). Springer Verlag. https://doi.org/10.1007/3-540-45123-4_18