Speeding up q-gram mining on grammar-based compressed texts

Abstract

We present an efficient algorithm for calculating q-gram frequencies on strings represented in compressed form, namely, as a straight line program (SLP). Given an SLP of size n that represents a string T, the algorithm computes the occurrence frequencies of all q-grams in T by reducing the problem to the weighted q-gram frequencies problem on a trie-like structure of size m = |T| − dup(q, T), where dup(q, T) is a quantity that represents the amount of redundancy that the SLP captures with respect to q-grams. The reduced problem can be solved in linear time. Since m = O(qn), the running time of our algorithm is O(min{|T| − dup(q, T), qn}), improving our previous O(qn) algorithm when q = Ω(|T|/n). © 2012 Springer-Verlag.
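To make the setting concrete, the following Python sketch counts q-gram frequencies directly on an SLP without decompressing it. It illustrates only the general weighted-counting idea behind an O(qn)-style baseline, not the faster trie-based algorithm the abstract describes. The SLP encoding (a list whose entries are either a single character or a pair of earlier rule indices) and all function names are assumptions made for this illustration.

```python
from collections import Counter

def expand(rules):
    """Decompress the SLP (for checking only); the last rule is the start symbol."""
    text = []
    for rule in rules:
        text.append(rule if isinstance(rule, str) else text[rule[0]] + text[rule[1]])
    return text[-1]

def naive_qgram_freqs(text, q):
    """Reference answer computed on the uncompressed string."""
    return Counter(text[i:i + q] for i in range(len(text) - q + 1))

def qgram_freqs_slp(rules, q):
    """Count q-grams of the string derived by the SLP, working on the grammar only.

    rules[i] is a single character, or a pair (l, r) with l, r < i meaning
    X_i -> X_l X_r (hypothetical encoding chosen for this sketch).
    """
    n, k = len(rules), q - 1
    pre = [""] * n  # prefix of X_i of length min(k, |X_i|)
    suf = [""] * n  # suffix of X_i of length min(k, |X_i|)
    for i, rule in enumerate(rules):
        if isinstance(rule, str):
            whole = rule
        else:
            l, r = rule
            whole = None
            pre[i] = (pre[l] + pre[r])[:k]
            joined = suf[l] + suf[r]
            suf[i] = joined[max(0, len(joined) - k):]
        if whole is not None:
            pre[i] = whole[:k]
            suf[i] = whole[max(0, len(whole) - k):]

    # vocc[i]: number of times X_i appears in the derivation tree of the start symbol.
    vocc = [0] * n
    vocc[-1] = 1
    for i in range(n - 1, -1, -1):
        if not isinstance(rules[i], str):
            l, r = rules[i]
            vocc[l] += vocc[i]
            vocc[r] += vocc[i]

    counts = Counter()
    for i, rule in enumerate(rules):
        if isinstance(rule, str):
            if q == 1:  # 1-grams live in the terminal rules
                counts[rule] += vocc[i]
            continue
        l, r = rule
        # Each side has length < q, so every q-gram of w crosses the X_l | X_r boundary.
        w = suf[l] + pre[r]
        for j in range(len(w) - q + 1):
            counts[w[j:j + q]] += vocc[i]
    return counts

# X2 = ab, X3 = abab, X4 = abababab
rules = ["a", "b", (0, 1), (2, 2), (3, 3)]
print(qgram_freqs_slp(rules, 2))
```

Each occurrence of a q-gram in T is counted exactly once, at the lowest variable whose two children it straddles, and is weighted by how often that variable occurs in the derivation tree. Every variable contributes a window of at most 2(q − 1) characters, which is where the qn factor in this sketch comes from.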

Goto, K., Bannai, H., Inenaga, S., & Takeda, M. (2012). Speeding up q-gram mining on grammar-based compressed texts. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7354 LNCS, pp. 220–231). https://doi.org/10.1007/978-3-642-31265-6_18
