Space-efficient data structures for flexible text retrieval systems

Kunihiko Sadakane

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2002) 2518 LNCS 14-24

DOI: 10.1007/3-540-36136-7_2

Abstract

We propose space-efficient data structures for text retrieval systems that combine the merits of theoretical data structures such as suffix trees and practical ones such as inverted files. Traditional text retrieval systems use inverted files and support ranking queries based on the tf*idf (term frequency times inverse document frequency) scores of documents that contain the given keywords; such queries cannot be answered using suffix trees alone. A drawback of these systems is that the scores can be computed only for predetermined keywords. We extend the data structures so that the scores can be computed efficiently for any pattern while keeping the size of the data structures moderate. The size is comparable to the text size, which improves on existing methods that use O(n log n) bits of space for a text collection of length n. © Springer-Verlag Berlin Heidelberg 2002.
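
To illustrate the kind of query the abstract describes, here is a minimal Python sketch that computes tf*idf scores for an arbitrary pattern rather than a predetermined keyword. It is not the data structure proposed in the paper: it scans every document naively and is meant only to show what is being computed. The function names, the choice to count overlapping occurrences, and the idf variant log(N / df) are assumptions made for this example.

```python
import math


def count_occurrences(text: str, pattern: str) -> int:
    """Count occurrences of pattern in text, including overlapping ones."""
    if not pattern:
        return 0
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        # Advance by one character so overlapping matches are counted too.
        start = text.find(pattern, start + 1)
    return count


def tfidf_scores(docs: list[str], pattern: str) -> list[float]:
    """Return one tf*idf score per document for an arbitrary pattern.

    Assumed definitions (one common variant, not taken from the paper):
      tf  = number of occurrences of the pattern in the document
      idf = log(N / df), where N is the number of documents and
            df is the number of documents containing the pattern
    """
    tfs = [count_occurrences(doc, pattern) for doc in docs]
    df = sum(1 for tf in tfs if tf > 0)
    if df == 0:
        # The pattern occurs nowhere, so every score is zero.
        return [0.0] * len(docs)
    idf = math.log(len(docs) / df)
    return [tf * idf for tf in tfs]


# Hypothetical toy collection, for illustration only.
docs = [
    "suffix trees and suffix arrays",
    "inverted files",
    "compressed suffix arrays",
]
scores = tfidf_scores(docs, "suffix")
```

Because the pattern is matched as a raw substring, any string can be scored, not just words that were indexed in advance. The cost of this sketch is a full scan of the collection per query, which is the inefficiency that an index comparable in size to the text is meant to avoid.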

Citation (APA)

Sadakane, K. (2002). Space-efficient data structures for flexible text retrieval systems. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2518 LNCS, pp. 14–24). https://doi.org/10.1007/3-540-36136-7_2
