The Lempel-Ziv 77 (LZ77) factorization of a string is a widely used algorithmic tool that plays a central role in compression and indexing. For a string of length $$n$$ over a linearly-sortable alphabet, e.g., $$\varSigma = \{1, \dots, \sigma\}$$ with $$\sigma = n^{\mathcal O(1)}$$, it can be computed in $$\mathcal O(n)$$ time. It is unknown whether this time can be achieved for the rightmost LZ parsing, where each referencing phrase points to its rightmost previous occurrence. The best known solution takes $$\mathcal O(n (1 + \log \sigma / \sqrt{\log n}))$$ time (Belazzougui & Puglisi, SODA 2016). We show that this problem is much easier to solve for the LZ-End factorization (Kreft & Navarro, DCC 2010), where the rightmost factorization can be obtained in $$\mathcal O(n)$$ time for the greedy parsing (with phrases of maximal length), and in $$\mathcal O(n + z \sqrt{\log z})$$ time for any LZ-End parsing of $$z$$ phrases. We also make advances towards a linear-time solution for the general case: we show how to find the rightmost occurrences for multiple non-trivial subsets of the phrases of any LZ-like parsing in $$\mathcal O(n)$$ time. As a prime example, we can find the rightmost occurrence of all phrases of length $$\varOmega(\log^{6.66} n / \log^2 \sigma)$$ in $$\mathcal O(n / \log_\sigma n)$$ time and space.
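To make the notion of a rightmost parsing concrete, the following is a minimal Python sketch of a greedy LZ77 factorization in which every referencing phrase points to its rightmost previous occurrence. It is a naive brute-force illustration with polynomial (not linear) running time and is not the algorithm of the paper; the function name `rightmost_lz77` and the tuple encoding of phrases are assumptions made only for this example.

```python
def rightmost_lz77(text):
    """Naive greedy LZ77 factorization with rightmost sources.

    Returns a list of phrases, each either
      ("lit", c)            a literal character c with no earlier occurrence, or
      ("ref", src, length)  a copy of text[src:src+length] with src < phrase start,
                            where src is the rightmost such starting position.
    Illustrative brute force only; this is not a linear-time method.
    """
    phrases = []
    i, n = 0, len(text)
    while i < n:
        best_len, best_src = 0, -1
        # Try every earlier starting position j (sources may overlap the phrase).
        for j in range(i):
            length = 0
            while i + length < n and text[j + length] == text[i + length]:
                length += 1
            # ">=" with increasing j keeps the largest j among maximal matches,
            # i.e. the rightmost previous occurrence of the greedy phrase.
            if length > 0 and length >= best_len:
                best_len, best_src = length, j
        if best_len == 0:
            phrases.append(("lit", text[i]))
            i += 1
        else:
            phrases.append(("ref", best_src, best_len))
            i += best_len
    return phrases


print(rightmost_lz77("ababcab"))
```

For the input `"ababcab"`, the last phrase `ab` occurs earlier at positions 0 and 2; a rightmost parsing records position 2, whereas a leftmost parsing would record position 0.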
Ellert, J., Fischer, J., & Pedersen, M. R. (2023). New Advances in Rightmost Lempel-Ziv. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 14240 LNCS, pp. 188–202). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-43980-3_15