Exploiting cooccurrence on corpus and document level for fair crosslanguage retrieval

ISSN: 16130073

Abstract

In this paper we describe the methodology, architecture and implementation of the information retrieval system we developed for the Robust WSD Task at CLEF 2008. Our system is based on an extensive query preprocessing step for homogenising the corpus queries. Query processing comprises three steps: first, a query expansion step based on WordNet synonyms or an Associative Index; second, a query translation step based on corpus article cooccurrence in Wikipedia; and third, a standard disjunctive index search in the CLEF corpus. The crosslanguage-enabled system thereby behaves as fairly as possible across different languages: we apply the same preprocessing steps to all queries, independent of the query and corpus language.
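The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the synonym table, the aligned article pairs standing in for linked Wikipedia articles, the tiny document collection, and every function name are hypothetical and chosen only to make the flow of expansion, cooccurrence-based translation and disjunctive (OR) search concrete.

```python
from collections import Counter, defaultdict

# Hypothetical toy resources; the real WordNet, Associative Index and
# Wikipedia data used by the system are not reproduced here.
SYNONYMS = {"car": ["automobile"]}

# (source-language text, target-language text) pairs standing in for
# Wikipedia articles linked across languages (an assumption).
ALIGNED_ARTICLES = [
    ("car engine", "auto motor"),
    ("car race", "auto rennen"),
    ("automobile engine", "automobil motor"),
    ("automobile factory", "automobil fabrik"),
]

DOCUMENTS = {
    "d1": "auto motor defekt",
    "d2": "rennen im regen",
    "d3": "automobil und auto",
}


def expand_query(terms):
    """Step 1: append synonyms of each query term (simple table lookup)."""
    expanded = list(terms)
    for term in terms:
        for synonym in SYNONYMS.get(term, []):
            if synonym not in expanded:
                expanded.append(synonym)
    return expanded


def build_cooccurrence(pairs):
    """Count how often a source term and a target term share an article pair."""
    counts = defaultdict(Counter)
    for source_text, target_text in pairs:
        for source_term in set(source_text.split()):
            for target_term in set(target_text.split()):
                counts[source_term][target_term] += 1
    return counts


def translate_query(terms, cooccurrence):
    """Step 2: replace each term by its most frequently cooccurring target term.

    Ties are broken alphabetically; terms without any cooccurrence data are
    kept unchanged (both are assumptions of this sketch).
    """
    translated = []
    for term in terms:
        candidates = cooccurrence.get(term)
        if candidates:
            best = min(candidates, key=lambda t: (-candidates[t], t))
            translated.append(best)
        else:
            translated.append(term)
    return translated


def disjunctive_search(terms, documents):
    """Step 3: OR-search; rank documents by the number of distinct matched terms."""
    results = []
    for doc_id, text in documents.items():
        tokens = set(text.split())
        matched = sum(1 for term in set(terms) if term in tokens)
        if matched:
            results.append((doc_id, matched))
    return sorted(results, key=lambda item: (-item[1], item[0]))


def run_pipeline(query):
    """Apply the same three steps to any query, whatever its language."""
    terms = query.lower().split()
    expanded = expand_query(terms)
    translated = translate_query(expanded, build_cooccurrence(ALIGNED_ARTICLES))
    return disjunctive_search(translated, DOCUMENTS)
```

Calling `run_pipeline("car")` on this toy data expands the query to `car` and `automobile`, translates both terms through the cooccurrence counts, and ranks the documents that contain at least one translated term. Because every query passes through identical steps, the sketch mirrors the uniform, language-independent treatment the abstract describes.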

Cite

Juffinger, A., Kern, R., & Granitzer, M. (2008). Exploiting cooccurrence on corpus and document level for fair crosslanguage retrieval. In CEUR Workshop Proceedings (Vol. 1174). CEUR-WS.
