Effective Arabic Stemmer Based Hybrid Approach for Arabic Text Categorization

Meryeme Hadni; Said Alaoui Ouatik; Abdelmonaime Lachkar

Journal Article, Open Access

International Journal of Data Mining & Knowledge Management Process (2013) 3(4) 1-14

DOI: 10.5121/ijdkp.2013.3401

Abstract

Text pre-processing for the Arabic language is a challenging and crucial stage in Text Categorization (TC) in particular and Text Mining (TM) in general. Stemming algorithms can be employed in Arabic text pre-processing to reduce words to their stems or roots. Arabic stemming algorithms fall into three categories: the root-based approach (e.g., Khoja), the stem-based approach (e.g., Larkey), and the statistical approach (e.g., N-gram). However, no stemmer for this language is perfect: the existing stemmers have limited effectiveness. In this paper, to improve the accuracy of stemming, and therefore the accuracy of our proposed TC system, an efficient hybrid method is proposed for stemming Arabic text. The effectiveness of the four methods (the three existing approaches and the proposed hybrid) was evaluated and compared in terms of the F-measure of the Naïve Bayesian classifier and the Support Vector Machine classifier used in our TC system. The proposed stemming algorithm was found to outperform the other stemmers: the results show that using the proposed stemmer greatly enhances the performance of Arabic Text Categorization.
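To make the stem-based category concrete, here is a minimal Python sketch of light affix stripping. It is not the hybrid stemmer proposed in the paper, nor a faithful reimplementation of Larkey's stemmer: the function name `light_stem`, the affix lists, and the minimum-length rule are all illustrative assumptions.

```python
# Illustrative affix lists (assumptions, not taken from the paper).
# Longer prefixes come first so that "وال" is tried before "ال".
PREFIXES = ("وال", "بال", "كال", "فال", "ال", "لل")
SUFFIXES = ("ها", "ان", "ات", "ون", "ين", "ية", "ه", "ة", "ي")


def light_stem(word: str, min_len: int = 2) -> str:
    """Strip at most one prefix, then any matching suffixes in list order.

    An affix is removed only if at least `min_len` letters remain.
    """
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= min_len:
            word = word[len(prefix):]
            break  # remove a single prefix only
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_len:
            word = word[:-len(suffix)]
    return word


if __name__ == "__main__":
    for w in ("الكتاب", "المكتبة", "والكتاب", "معلمون"):
        print(w, "->", light_stem(w))
```

A stem-based stemmer of this kind stops at the stem (for example, it leaves "مكتب" for "المكتبة"), whereas a root-based stemmer would go further and try to recover the root "كتب".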

Citation (APA)

Hadni, M., Ouatik, S. A., & Lachkar, A. (2013). Effective Arabic Stemmer Based Hybrid Approach for Arabic Text Categorization. International Journal of Data Mining & Knowledge Management Process, 3(4), 1–14. https://doi.org/10.5121/ijdkp.2013.3401