In this paper we describe the methodology, architecture and implementation of the information retrieval system we developed for the Robust WSD Task at CLEF 2008. Our system is based on an extensive query preprocessing stage that homogenises the corpus queries. A query passes through three steps: first, a query expansion step based on WordNet synonyms or an associative index; second, a query translation step based on corpus article cooccurrence in Wikipedia; and third, a standard disjunctive index search in the CLEF corpus. The cross-language system is thereby designed to behave as fairly as possible across languages: we apply the same preprocessing steps to all queries, independent of the query and corpus language.
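The three steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `SYNONYMS`, `TRANSLATIONS`, `DOCS`, `expand`, `translate`, `disjunctive_search` and `retrieve` are hypothetical, the small dictionaries stand in for WordNet and for the Wikipedia cooccurrence-based translation, and ranking by the number of matched terms is an assumption made only to keep the example short.

```python
from collections import defaultdict

# Hypothetical toy resources (assumptions, not taken from the paper):
# SYNONYMS stands in for WordNet or an associative index,
# TRANSLATIONS stands in for the Wikipedia cooccurrence-based translation.
SYNONYMS = {"car": ["automobile"]}
TRANSLATIONS = {"car": ["auto"], "automobile": ["auto", "wagen"]}
DOCS = {
    "d1": "das auto ist rot",
    "d2": "der wagen und das auto",
    "d3": "ein haus",
}


def expand(terms, synonyms):
    """Step 1: add synonyms to each query term, keeping order and dropping duplicates."""
    out = []
    for term in terms:
        for candidate in [term] + synonyms.get(term, []):
            if candidate not in out:
                out.append(candidate)
    return out


def translate(terms, table):
    """Step 2: map each term to target-language terms; unknown terms pass through unchanged."""
    out = []
    for term in terms:
        for candidate in table.get(term, [term]):
            if candidate not in out:
                out.append(candidate)
    return out


def build_index(docs):
    """Build a simple inverted index: token -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def disjunctive_search(terms, index):
    """Step 3: OR-query; a document matches if it contains any term.

    Ranking by number of matched terms is an assumption of this sketch.
    """
    scores = defaultdict(int)
    for term in terms:
        for doc_id in index.get(term, ()):
            scores[doc_id] += 1
    return sorted(scores, key=lambda d: (-scores[d], d))


def retrieve(query, index, synonyms=SYNONYMS, table=TRANSLATIONS):
    """Apply the same three steps to every query, whatever its language."""
    terms = query.lower().split()
    return disjunctive_search(translate(expand(terms, synonyms), table), index)


INDEX = build_index(DOCS)
print(retrieve("car", INDEX))
```

Because `retrieve` runs the identical sequence for every query, a query already in the corpus language (for example `"haus"`) simply passes through expansion and translation unchanged, which mirrors the language-independent treatment the abstract describes.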
Juffinger, A., Kern, R., & Granitzer, M. (2008). Exploiting cooccurrence on corpus and document level for fair crosslanguage retrieval. In CEUR Workshop Proceedings (Vol. 1174). CEUR-WS.