In a seminal paper of Charikar et al. on the smallest grammar problem, the authors derive upper and lower bounds on the approximation ratios for several grammar-based compressors, but in all cases there is a gap between the lower and upper bound. Here we close the gaps for LZ78 and BISECTION by showing that the approximation ratio of LZ78 is Θ((n/ log n)2/3), whereas the approximation ratio of BISECTION is Θ((n/ log n)1/2). We also derive a lower bound for a smallest grammar for a word in terms of its number of LZ77-factors, which refines existing bounds of Rytter. Finally, we improve results of Arpe and Reischuk relating grammar-based compression for arbitrary alphabets and binary alphabets.
CITATION STYLE
Hucke, D., Lohrey, M., & Reh, C. P. (2016). The smallest grammar problem revisited. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9954 LNCS, pp. 35–49). Springer Verlag. https://doi.org/10.1007/978-3-319-46049-9_4
Mendeley helps you to discover research relevant for your work.