Practical and flexible pattern matching over Ziv-Lempel compressed text

Abstract

We address the problem of string matching on Ziv-Lempel compressed text. The goal is to search for a pattern in a text without uncompressing it. This is highly relevant for maintaining compressed text databases in which efficient searching is still possible. We develop a general technique for string matching when the text comes as a sequence of blocks, which abstracts the essential features of Ziv-Lempel compression. We then apply the scheme to each particular type of compression. We present an algorithm to find all the matches of a pattern in a text compressed using LZ77. When we apply our scheme to LZ78, we obtain a much more efficient search algorithm, which is faster than uncompressing the text and then searching it. Finally, we propose a new hybrid compression scheme that lies between LZ77 and LZ78: in practice it compresses as well as LZ77 and can be searched as fast as LZ78. We also show how to search Ziv-Lempel compressed text for some extended patterns, such as classes of characters, and how to perform approximate string matching. © 2003 Elsevier B.V. All rights reserved.
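To make the idea of searching a text that "comes as a sequence of blocks" concrete, here is a minimal Python sketch for the LZ78 case. It is not the algorithm of the paper: the abstract does not give the authors' method, so the sketch substitutes a plain Knuth-Morris-Pratt automaton and records, for every block, how the automaton moves across that block. All function names (lz78_compress, build_failure, step, count_matches) are hypothetical, and the encoding of a trailing incomplete phrase as a block with an empty character is an assumption made for this illustration. The sketch only counts occurrences; it never rebuilds the text.

```python
def lz78_compress(text):
    """Return LZ78 blocks as (parent_index, char) pairs; index 0 is the empty block."""
    trie = {}      # (parent_index, char) -> index of the block that extends parent by char
    blocks = []
    node = 0
    for ch in text:
        nxt = trie.get((node, ch))
        if nxt is not None:
            node = nxt                      # keep extending the current phrase
        else:
            blocks.append((node, ch))       # new block = parent block + ch
            trie[(node, ch)] = len(blocks)
            node = 0
    if node != 0:
        # Assumption of this sketch: a trailing phrase already in the
        # dictionary is emitted as a reference with an empty character.
        blocks.append((node, ""))
    return blocks


def build_failure(pattern):
    """fail[k] = length of the longest proper border of pattern[:k]."""
    m = len(pattern)
    fail = [0] * (m + 1)
    k = 0
    for i in range(1, m):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i + 1] = k
    return fail


def step(pattern, fail, state, ch):
    """Advance the KMP automaton by one character."""
    m = len(pattern)
    while state > 0 and (state == m or pattern[state] != ch):
        state = fail[state]
    if state < m and pattern[state] == ch:
        state += 1
    return state


def count_matches(blocks, pattern):
    """Count (possibly overlapping) occurrences of pattern without decompressing."""
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    fail = build_failure(pattern)
    # trans[b][s]: automaton state after reading block b when starting in state s
    # hits[b][s]:  occurrences that end inside block b when starting in state s
    trans = [list(range(m + 1))]            # block 0 is empty: identity mapping
    hits = [[0] * (m + 1)]
    state, total = 0, 0
    for parent, ch in blocks:
        if ch == "":
            t, h = trans[parent], hits[parent]
        else:
            # Each block is its parent plus one character, so its tables
            # follow from the parent's tables in O(m) time.
            t = [step(pattern, fail, q, ch) for q in trans[parent]]
            h = [hits[parent][s] + (1 if t[s] == m else 0) for s in range(m + 1)]
        trans.append(t)
        hits.append(h)
        total += h[state]
        state = t[state]
    return total
```

For example, count_matches(lz78_compress("aaaaaa"), "aa") returns 5. The sketch spends O(m) time and space per block for a pattern of length m, which is acceptable for illustration but makes no claim to match the efficiency reported in the paper.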

Cite

Navarro, G., & Raffinot, M. (2004). Practical and flexible pattern matching over Ziv-Lempel compressed text. Journal of Discrete Algorithms, 2(3), 347–371. https://doi.org/10.1016/j.jda.2003.12.002
