Computational morphology is an urgent problem for Arabic natural language processing, because Arabic is a highly inflected language. We have found, however, that a full solution to this problem is not required for effective information retrieval. Light stemming allows remarkably good information retrieval without providing correct morphological analyses. We developed several light stemmers for Arabic and assessed their effectiveness for information retrieval using standard TREC data. We also compared light stemming with several stemmers based on morphological analysis. The light stemmer light10 outperformed the other approaches. It has been included in the Lemur toolkit and is becoming widely used for Arabic information retrieval.
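To illustrate the general idea of light stemming, here is a minimal Python sketch that strips a small set of common prefixes and suffixes without attempting any morphological analysis. The affix lists, the `min_len` threshold, and the function name `light_stem` are assumptions chosen for illustration; they are not taken from this abstract and should not be read as the published light10 definition or as the Lemur implementation.

```python
# Illustrative affix lists (assumed for this sketch, not the official light10 lists).
PREFIXES = ["وال", "بال", "كال", "فال", "ال", "لل", "و"]
SUFFIXES = ["ها", "ان", "ات", "ون", "ين", "يه", "ية", "ه", "ة", "ي"]


def light_stem(word: str, min_len: int = 2) -> str:
    """Strip at most one prefix, then any number of suffixes.

    An affix is removed only if at least `min_len` characters remain
    (a hypothetical threshold that protects very short words).
    """
    # Try longer prefixes first so "وال" wins over "و".
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if word.startswith(prefix) and len(word) - len(prefix) >= min_len:
            word = word[len(prefix):]
            break

    # Repeatedly strip suffixes until none applies.
    stripped = True
    while stripped:
        stripped = False
        for suffix in sorted(SUFFIXES, key=len, reverse=True):
            if word.endswith(suffix) and len(word) - len(suffix) >= min_len:
                word = word[:-len(suffix)]
                stripped = True
                break
    return word


if __name__ == "__main__":
    for w in ["الكتاب", "والكتاب", "المعلمون", "المعلمين"]:
        print(w, "->", light_stem(w))
```

Under these assumed lists, inflected variants of the same word tend to collapse to a shared index term, which is what matters for retrieval, even though the output is not guaranteed to be a linguistically correct stem or root.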
CITATION STYLE
Soudi, A., Neumann, G., & Bosch, A. van den. (2007). Arabic Computational Morphology: Knowledge-based and Empirical Methods. In Arabic Computational Morphology (pp. 3–14). Springer Netherlands. https://doi.org/10.1007/978-1-4020-6046-5_1