Efficient text proximity search

Ralf Schenkel; Andreas Broschart; Seungwon Hwang; Martin Theobald; Gerhard Weikum

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2007) 4726 LNCS 287-299

DOI: 10.1007/978-3-540-75530-2_26

39 Citations

28 Readers

Abstract

In addition to purely occurrence-based relevance models, term proximity has been frequently used to enhance retrieval quality of keyword-oriented retrieval systems. While there have been approaches on effective scoring functions that incorporate proximity, there has not been much work on algorithms or access methods for their efficient evaluation. This paper presents an efficient evaluation framework including a proximity scoring function integrated within a top-k query engine for text retrieval. We propose precomputed and materialized index structures that boost performance. The increased retrieval effectiveness and efficiency of our framework are demonstrated through extensive experiments on a very large text benchmark collection. In combination with static index pruning for the proximity lists, our algorithm achieves an improvement of two orders of magnitude compared to a term-based top-k evaluation, with a significantly improved result quality. © Springer-Verlag Berlin Heidelberg 2007.
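The abstract does not spell out the scoring function or the index layout, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a proximity score that sums 1/d² over every pair of occurrences of two distinct query terms at distance d, a precomputed list of (document, score) entries per term pair, and a `keep` parameter that truncates each list as a simple stand-in for static index pruning. All function names and the weighting are hypothetical.

```python
import heapq
from collections import defaultdict
from itertools import combinations


def term_positions(tokens):
    """Map each term to the list of token positions where it occurs."""
    positions = defaultdict(list)
    for i, tok in enumerate(tokens):
        positions[tok].append(i)
    return positions


def pair_proximity(pos_a, pos_b):
    """Sum 1/d^2 over all occurrence pairs (assumed weighting).

    The two position lists must belong to distinct terms, so d is never 0.
    """
    return sum(1.0 / (pa - pb) ** 2 for pa in pos_a for pb in pos_b)


def build_pair_list(docs, term_a, term_b, keep=None):
    """Precompute (doc_id, score) entries for one term pair, best first.

    `keep` truncates the list; this is a simplified stand-in for pruning.
    """
    entries = []
    for doc_id, tokens in docs.items():
        pos = term_positions(tokens)
        if term_a in pos and term_b in pos:
            entries.append((doc_id, pair_proximity(pos[term_a], pos[term_b])))
    entries.sort(key=lambda e: (-e[1], e[0]))
    return entries if keep is None else entries[:keep]


def top_k(docs, query_terms, k, keep=None):
    """Add up pair-list scores per document and return the k best."""
    totals = defaultdict(float)
    for a, b in combinations(sorted(set(query_terms)), 2):
        for doc_id, score in build_pair_list(docs, a, b, keep):
            totals[doc_id] += score
    return heapq.nlargest(k, totals.items(), key=lambda e: e[1])


docs = {
    "d1": "efficient text proximity search".split(),
    "d2": "text retrieval with efficient top k search over proximity".split(),
}
best = top_k(docs, ["text", "proximity"], k=1)
```

In this toy collection the two query terms are adjacent in `d1` and eight positions apart in `d2`, so `d1` ranks first. A real engine would build the pair lists once at indexing time and scan them with a threshold-style top-k algorithm rather than rebuilding them per query as this sketch does.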

Citation (APA)

Schenkel, R., Broschart, A., Hwang, S., Theobald, M., & Weikum, G. (2007). Efficient text proximity search. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4726 LNCS, pp. 287–299). Springer Verlag. https://doi.org/10.1007/978-3-540-75530-2_26
