Exploiting cooccurrence on corpus and document level for fair crosslanguage retrieval

Andreas Juffinger; Roman Kern; Michael Granitzer

Conference Proceedings

CEUR Workshop Proceedings (2008) 1174

ISSN: 1613-0073

Abstract

In this paper we describe the methodology, architecture and implementation of the information retrieval system we developed for the Robust WSD Task at CLEF 2008. Our system is based on an extensive query preprocessing step for homogenisation of the corpus queries. The preprocessing of queries includes: firstly, a query expansion step based on WordNet synonyms or an associative index; secondly, a query translation step based on corpus article cooccurrence in Wikipedia; and thirdly, a standard disjunctive index search in the CLEF corpus. The crosslanguage-enabled system thereby behaves as fairly as possible across different languages: we apply the same preprocessing steps to all queries, independent of the query and corpus language.
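The three preprocessing and search steps named in the abstract can be illustrated with a minimal sketch. The abstract does not give implementation details, so everything below is an assumption made for illustration: the toy synonym table, the aligned article pairs standing in for Wikipedia articles, the tiny target corpus, and all function names are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict

# Hypothetical toy data; none of it comes from the paper.
SYNONYMS = {"car": ["automobile"]}

# (source-language tokens, target-language tokens) for articles on the same
# topic, standing in for article-level cooccurrence in Wikipedia.
ALIGNED_ARTICLES = [
    (["car", "engine"], ["auto", "motor"]),
    (["car", "road"], ["auto", "strasse"]),
    (["automobile"], ["auto"]),
    (["road", "bridge"], ["strasse", "bruecke"]),
]

CORPUS = {
    "d1": ["auto", "motor"],
    "d2": ["strasse", "haus"],
    "d3": ["haus", "garten"],
}


def expand(query):
    """Step 1: append the synonyms of every query term."""
    expanded = list(query)
    for term in query:
        for synonym in SYNONYMS.get(term, []):
            if synonym not in expanded:
                expanded.append(synonym)
    return expanded


def translate(terms, top_k=1):
    """Step 2: map each term to the target terms it cooccurs with most often."""
    translated = []
    for term in terms:
        counts = Counter()
        for source_tokens, target_tokens in ALIGNED_ARTICLES:
            if term in source_tokens:
                # Count each target term once per aligned article pair.
                counts.update(sorted(set(target_tokens)))
        for target, _ in counts.most_common(top_k):
            if target not in translated:
                translated.append(target)
    return translated


def build_index(corpus):
    """Build an inverted index: token -> set of document ids."""
    index = defaultdict(set)
    for doc_id, tokens in corpus.items():
        for token in tokens:
            index[token].add(doc_id)
    return index


def disjunctive_search(index, terms):
    """Step 3: OR-search; rank documents by number of matching query terms."""
    scores = Counter()
    for term in terms:
        for doc_id in index.get(term, ()):
            scores[doc_id] += 1
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def retrieve(query):
    """Run the same three steps for every query, whatever its language."""
    return disjunctive_search(build_index(CORPUS), translate(expand(query)))


print(retrieve(["car", "road"]))
```

Because every query passes through the same `expand`, `translate` and `disjunctive_search` calls, the sketch mirrors the fairness idea stated in the abstract: no language receives special-case handling. A real system would replace the toy tables with WordNet or an associative index, Wikipedia article statistics, and a full-text index over the CLEF corpus.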

Cite

Juffinger, A., Kern, R., & Granitzer, M. (2008). Exploiting cooccurrence on corpus and document level for fair crosslanguage retrieval. In CEUR Workshop Proceedings (Vol. 1174). CEUR-WS.
