Weighting query terms based on distributional statistics

Jussi Karlgren; Magnus Sahlgren; Rickard Cöster

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2006) 4022 LNCS 208-211

DOI: 10.1007/11878773_24

Abstract

This year, the SICS team has concentrated on query processing and on the internal topical structure of the query, specifically compound translation. Compound translation is non-trivial due to dependencies between compound elements. This year, we have investigated topical dependencies between query terms: if a query term happens to be non-topical or noise, it should be discarded or given a low weight when ranking retrieved documents; if a query term shows high topicality, its weight should be boosted. The two experiments described here are based on the analysis of the distributional character of query terms: one uses the similarity of occurrence context between query terms globally across the entire collection; the other uses the likelihood of individual terms to appear topically in individual texts. The two boosting schemes tested are complementary, and both delivered improved results. © Springer-Verlag Berlin Heidelberg 2006.
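
The abstract does not specify how context similarity is measured or how it is turned into a weight, so the following is only a minimal illustrative sketch of the first idea (boosting query terms whose occurrence contexts resemble those of the other query terms). It is not the authors' method. The document-level co-occurrence vectors, the cosine measure, the additive boost, and all function names (`context_vectors`, `cosine`, `weight_query_terms`) are assumptions made for illustration, and the four-document collection is toy data.

```python
import math
from collections import Counter

def context_vectors(docs):
    """Map each term to a Counter of the terms it co-occurs with (per document).
    Assumption: whitespace tokenization and document-level context windows."""
    vectors = {}
    for doc in docs:
        counts = Counter(doc.lower().split())
        for term in counts:
            vec = vectors.setdefault(term, Counter())
            for other, n in counts.items():
                if other != term:
                    vec[other] += n
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors; 0.0 if either is empty."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def weight_query_terms(query_terms, vectors, base=1.0):
    """Hypothetical scheme: base weight plus the mean context similarity
    to the other query terms. Isolated (possibly noisy) terms stay at base."""
    weights = {}
    for term in query_terms:
        others = [t for t in query_terms if t != term]
        if not others or term not in vectors:
            weights[term] = base
            continue
        sims = [cosine(vectors[term], vectors.get(o, Counter())) for o in others]
        weights[term] = base + sum(sims) / len(sims)
    return weights

# Toy collection and query (illustrative data only).
docs = [
    "nuclear reactor safety inspection",
    "nuclear reactor fuel safety",
    "football match report",
    "football league match",
]
weights = weight_query_terms(["nuclear", "reactor", "match"], context_vectors(docs))
```

In this toy setup, "nuclear" and "reactor" share contexts and therefore receive a higher weight than "match", which shares no context with the other query terms. The second scheme mentioned in the abstract, based on how likely a term is to appear topically in individual texts, would need a separate per-term statistic that the abstract does not define.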

Cite

Karlgren, J., Sahlgren, M., & Cöster, R. (2006). Weighting query terms based on distributional statistics. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4022 LNCS, pp. 208–211). Springer Verlag. https://doi.org/10.1007/11878773_24
