Term conflation and blind relevance feedback for information retrieval on Indian languages


Abstract

For the first participation of Dublin City University (DCU) in the FIRE 2010 evaluation campaign, Information Retrieval (IR) experiments on English, Bengali, Hindi, and Marathi documents were performed to investigate term conflation, Blind Relevance Feedback (BRF), and manual and automatic query translation. The experiments are based on BM25 and on language modeling (LM) for IR. Results show that term conflation always improves Mean Average Precision (MAP) compared to indexing unprocessed word forms, but different approaches seem to work best for different languages. For example, in monolingual Marathi experiments, indexing 5-prefixes outperforms our corpus-based stemmer, whereas in Hindi the corpus-based stemming approach achieves a higher MAP. For Bengali, the LM retrieval model with the rule-based stemmer achieves a higher (but not significantly higher) MAP than BM25 with a corpus-based stemmer (0.4583 vs. 0.4526). In all experiments, BRF yields considerably higher MAP than the corresponding experiments without it. Bilingual IR experiments (English-to-Bengali and English-to-Hindi) are based on query translations obtained from native speakers and from the Google Translate web service. For the automatically translated queries, MAP is slightly (but not significantly) lower than in experiments with manual query translations. The bilingual English-to-Bengali (English-to-Hindi) experiments achieve 81.7%-83.3% (78.0%-80.6%) of the best corresponding monolingual experiments.
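
As an illustration of the simplest conflation approach mentioned above, indexing 5-prefixes can be sketched as truncating every token to its first five characters, so that inflected forms sharing a prefix map to the same index term. This is a minimal sketch and not the authors' implementation: the function names `prefix_conflate` and `index_terms`, the whitespace tokenization, and the choice to count Unicode code points as characters are all assumptions made for illustration. The abstract does not say how characters are counted for Indic scripts.

```python
from typing import List


def prefix_conflate(token: str, n: int = 5) -> str:
    """Return the first n characters of a token (its n-prefix).

    Assumption: a "character" is one Unicode code point. For Indic
    scripts a real system might count grapheme clusters instead.
    """
    return token[:n]


def index_terms(text: str, n: int = 5) -> List[str]:
    """Split on whitespace (hypothetical tokenizer) and conflate each token."""
    return [prefix_conflate(tok, n) for tok in text.split()]


# Three inflected forms collapse to a single index term.
terms = index_terms("retrieval retrieved retrieving")
print(terms)
```

Tokens shorter than `n` characters are left unchanged, because slicing past the end of a string returns the whole string.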

Leveling, J., Ganguly, D., & Jones, G. J. F. (2013). Term conflation and blind relevance feedback for information retrieval on Indian languages. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7536 LNCS, pp. 295–309). https://doi.org/10.1007/978-3-642-40087-2_28
