Danish and Greek web search experiments with Hummingbird SearchServer™ at CLEF 2005

Abstract

Hummingbird participated in the WebCLEF mixed monolingual retrieval task of the Cross-Language Evaluation Forum (CLEF) 2005. In this task, the system was given 547 known-item queries in 11 languages (134 Spanish, 121 English, 59 Dutch, 59 Portuguese, 57 German, 35 Hungarian, 30 Danish, 30 Russian, 16 Greek, 5 Icelandic and 1 French). The goal was to find the desired page in the 82 GB EuroGOV collection (3.4 million pages crawled from government sites of 27 European domains). Our experiments found that stopword processing was more important than anticipated, perhaps because words that are common in one language tend to be overweighted by inverse document frequency in a mixed-language collection. Extra weight on the document title helped significantly, and extra weight on shallower URLs significantly helped home page queries. Stemming had a neutral impact on average, but it made a substantial difference for some individual queries. We analyze several Danish and Greek queries in detail. © Springer-Verlag Berlin Heidelberg 2006.
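
The stopword observation in the abstract can be illustrated with a small sketch. It is not SearchServer's actual weighting scheme, which the abstract does not describe; it assumes the textbook formula idf = log(N / df), and the `idf` helper and the toy Danish and English documents are invented for illustration. The Danish word "og" ("and") appears in every Danish document, so it carries no weight in a Danish-only collection, but it looks fairly rare once the Danish pages are a minority in a mixed collection.

```python
import math


def idf(term, docs):
    """Textbook inverse document frequency: log(N / df).

    Hypothetical helper for illustration only; df is the number of
    documents that contain the term as a whitespace-separated token.
    """
    n_docs = len(docs)
    df = sum(1 for doc in docs if term in doc.lower().split())
    if df == 0:
        return 0.0  # term absent from the collection; no weight assigned
    return math.log(n_docs / df)


# Toy documents (invented): three Danish pages, seven English pages.
danish_docs = [
    "skat og moms for borgere",
    "sundhed og miljø i kommunen",
    "regler og love om trafik",
]
english_docs = [
    "tax rules for citizens",
    "health services in the city",
    "traffic law and regulations",
    "government news and events",
    "public transport timetable",
    "education policy overview",
    "employment statistics report",
]
mixed_docs = danish_docs + english_docs

# "og" occurs in every Danish page, so its IDF is zero in a Danish-only index.
idf_og_danish_only = idf("og", danish_docs)

# In the mixed collection it occurs in only 3 of 10 pages and gains weight.
idf_og_mixed = idf("og", mixed_docs)

# A genuine content word ("moms", i.e. VAT) for comparison.
idf_moms_mixed = idf("moms", mixed_docs)

print(f"og, Danish only: {idf_og_danish_only:.3f}")
print(f"og, mixed:       {idf_og_mixed:.3f}")
print(f"moms, mixed:     {idf_moms_mixed:.3f}")
```

Under these assumptions, a function word that would be ignored in a monolingual index receives a non-trivial weight in the mixed index, which is consistent with the abstract's suggestion that explicit stopword lists matter more in a mixed-language collection.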

Cite

Tomlinson, S. (2006). Danish and Greek web search experiments with Hummingbird SearchServer™ at CLEF 2005. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4022 LNCS, pp. 846–855). Springer Verlag. https://doi.org/10.1007/11878773_92