This paper describes our approach to the 2006 Adhoc Monolingual Information Retrieval run for French. The goal of our experiment was to compare the performance of a proposed statistical stemmer with that of a rule-based stemmer, specifically the French version of Porter's stemmer. The statistical stemming approach is based on lexicon clustering using a novel string distance measure. We submitted three official runs, in addition to a baseline run that uses no stemming. The results show that, as expected, stemming significantly improves retrieval performance, by about 9-10%, and that the statistical stemmer performs comparably to the rule-based stemmer. © Springer-Verlag Berlin Heidelberg 2007.
CITATION STYLE
Majumder, P., Mitra, M., & Datta, K. (2007). Statistical vs. rule-based stemming for monolingual French retrieval. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4730 LNCS, pp. 107–110). Springer Verlag. https://doi.org/10.1007/978-3-540-74999-8_14