Dotted suffix trees: a structure for approximate text indexing

Luís Pedro Coelho; Arlindo L. Oliveira

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2006) 4209 LNCS 329-336

DOI: 10.1007/11880561_27

Abstract

In this work, we address the problem of text indexing for approximate matching. A text τ is preprocessed to generate an index, which can later be queried to identify the places where a string occurs with up to a given number of errors k (edit distance). The indexing structure occupies O(n log^k n) space in the average case, independent of alphabet size. It can be used to report the existence of a match with k errors in O(3^k m^(k+1)) time and to report the occurrences in O(3^k m^(k+1) + ed) time, where m is the length of the pattern and ed is the number of matching edit scripts. The construction time of the structure is bounded by O(kN|Σ|), where N is the number of nodes in the index and |Σ| is the alphabet size. © Springer-Verlag Berlin Heidelberg 2006.
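
To make the query semantics concrete, the sketch below reports where a pattern occurs in a text with at most k errors under edit distance. It is not the dotted suffix tree from the paper: it uses the classic index-free dynamic-programming scan (Sellers' algorithm), which takes O(nm) time per query and is the kind of baseline an index aims to beat. The function name `approx_occurrences` and the choice to report end positions are assumptions made for illustration.

```python
def approx_occurrences(text: str, pattern: str, k: int) -> list[int]:
    """Return the end positions (exclusive) in `text` at which some substring
    matches `pattern` with edit distance <= k.

    Illustrative baseline only (Sellers' algorithm); no index is built.
    """
    m = len(pattern)
    # col[i] = minimum edit distance between pattern[:i] and any substring
    # of the text that ends at the current text position.
    col = list(range(m + 1))
    hits = []
    if col[m] <= k:
        hits.append(0)  # the pattern is short enough to match the empty string
    for j, ch in enumerate(text, start=1):
        prev_diag = col[0]
        col[0] = 0  # a match may start at any text position at no cost
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            old = col[i]
            col[i] = min(
                prev_diag + cost,  # match or substitution
                old + 1,           # extra character in the text
                col[i - 1] + 1,    # pattern character missing from the text
            )
            prev_diag = old
        if col[m] <= k:
            hits.append(j)
    return hits


if __name__ == "__main__":
    print(approx_occurrences("abcdefg", "cxe", 1))
```

Here the pattern "cxe" matches the substring "cde" with one substitution, so the scan reports a single end position. An index such as the one described in the abstract answers the same question without rescanning the text for every query.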


Coelho, L. P., & Oliveira, A. L. (2006). Dotted suffix trees: a structure for approximate text indexing. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4209 LNCS, pp. 329–336). Springer Verlag. https://doi.org/10.1007/11880561_27
