Arabic Text Categorization Using Mixed Words

Mahmoud Hussein; Hamdy M. Mousa; Rouhia M. Sallam

Journal Article, Open Access

International Journal of Information Technology and Computer Science (2016) 8(11) 74-81

DOI: 10.5815/ijitcs.2016.11.09

Abstract

There is a tremendous number of Arabic text documents available online, and that number grows every day, so categorizing these documents has become very important. In this paper, an approach is proposed to enhance the accuracy of Arabic text categorization. It is based on a new feature representation technique that uses a mixture of a bag of words (BOW) and two adjacent words in different proportions. It also introduces a new feature selection technique that depends on Term Frequency (TF), and it uses the Frequency Ratio Accumulation Method (FRAM) as a classifier. Experiments are performed without normalization or stemming, with only one of them, and with both. In addition, three data sets covering different categories have been collected from online Arabic documents to evaluate the proposed approach. The highest accuracy obtained is 98.61%, achieved with normalization.
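The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `pair_share` parameter, and the toy English tokens are all hypothetical, and the FRAM scoring follows the general idea of accumulating per-category frequency ratios, which is assumed here rather than taken from the paper. Normalization and stemming are omitted.

```python
from collections import Counter, defaultdict


def mixed_features(tokens):
    """Return the single words and the adjacent word pairs of one document."""
    pairs = [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]
    return list(tokens), pairs


def select_vocabulary(docs, n_features, pair_share):
    """Keep the most frequent terms, mixing words and pairs.

    pair_share is a hypothetical knob: the fraction of the vocabulary
    reserved for adjacent word pairs (the "different proportions").
    """
    word_tf, pair_tf = Counter(), Counter()
    for tokens in docs:
        words, pairs = mixed_features(tokens)
        word_tf.update(words)
        pair_tf.update(pairs)
    n_pairs = int(n_features * pair_share)
    n_words = n_features - n_pairs
    vocab = {term for term, _ in word_tf.most_common(n_words)}
    vocab |= {term for term, _ in pair_tf.most_common(n_pairs)}
    return vocab


def train_fram(labeled_docs, vocab):
    """Assumed FRAM training: ratio of each term's frequency per category."""
    freq = defaultdict(Counter)  # term -> category -> count
    for tokens, label in labeled_docs:
        words, pairs = mixed_features(tokens)
        for term in words + pairs:
            if term in vocab:
                freq[term][label] += 1
    ratios = {}
    for term, per_cat in freq.items():
        total = sum(per_cat.values())
        ratios[term] = {cat: n / total for cat, n in per_cat.items()}
    return ratios


def classify(tokens, ratios):
    """Accumulate the ratios of the document's terms; pick the top category."""
    words, pairs = mixed_features(tokens)
    scores = Counter()
    for term in words + pairs:
        for cat, ratio in ratios.get(term, {}).items():
            scores[cat] += ratio
    return scores.most_common(1)[0][0] if scores else None


# Toy training data (hypothetical; real input would be Arabic tokens).
train = [
    (["goal", "match", "team"], "sport"),
    (["match", "team", "win"], "sport"),
    (["bank", "market", "price"], "economy"),
    (["market", "price", "rise"], "economy"),
]
vocab = select_vocabulary([doc for doc, _ in train], n_features=10, pair_share=0.3)
ratios = train_fram(train, vocab)
prediction = classify(["team", "match", "win"], ratios)
```

Changing `pair_share` shifts the balance between single words and adjacent pairs in the selected vocabulary, which is the kind of proportion the abstract says the experiments vary.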

Cite

Hussein, M., Mousa, H. M., & Sallam, R. M. (2016). Arabic Text Categorization Using Mixed Words. International Journal of Information Technology and Computer Science, 8(11), 74–81. https://doi.org/10.5815/ijitcs.2016.11.09
