Arabic Text Categorization Using Mixed Words

  • Hussein M
  • Mousa H
  • et al.

Abstract

A tremendous number of Arabic text documents is available online, and that number grows every day, so categorizing these documents has become very important. This paper proposes an approach to improve the accuracy of Arabic text categorization. It is based on a new feature representation technique that uses a mixture of a bag of words (BOW) and two adjacent words in different proportions. It also introduces a new feature selection technique that depends on Term Frequency (TF), and it uses the Frequency Ratio Accumulation Method (FRAM) as a classifier. Experiments are performed with neither normalization nor stemming, with only one of them, and with both. In addition, three data sets of different categories have been collected from online Arabic documents to evaluate the proposed approach. The highest accuracy obtained is 98.61%, achieved using normalization.
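The pipeline summarized in the abstract can be illustrated with a rough sketch. The abstract does not give the paper's formulas, so everything here is an assumption: the function names (`terms`, `train`, `classify`), the `top_k` and `pair_share` parameters, the per-category top-TF selection rule, and the frequency-ratio scoring (a term's frequency in one category divided by its total frequency across categories, summed over the terms of a document). English toy tokens stand in for Arabic text, and no normalization or stemming is applied.

```python
from collections import Counter, defaultdict


def terms(tokens):
    """Return single words plus adjacent word pairs, each tagged by kind."""
    singles = [("w", t) for t in tokens]
    pairs = [("p", a + " " + b) for a, b in zip(tokens, tokens[1:])]
    return singles + pairs


def train(labeled_docs, top_k=4, pair_share=0.5):
    """Build frequency ratios for a TF-selected mix of words and word pairs.

    top_k and pair_share are hypothetical knobs: top_k features are kept per
    category, of which a fraction pair_share are adjacent word pairs.
    """
    tf = defaultdict(Counter)  # category -> term -> frequency
    for tokens, label in labeled_docs:
        tf[label].update(terms(tokens))

    n_pairs = round(top_k * pair_share)
    selected = set()
    for counts in tf.values():
        ranked = [t for t, _ in counts.most_common()]  # highest TF first
        selected.update([t for t in ranked if t[0] == "w"][: top_k - n_pairs])
        selected.update([t for t in ranked if t[0] == "p"][:n_pairs])

    ratios = {}
    for term in selected:
        total = sum(tf[c][term] for c in tf)
        ratios[term] = {c: tf[c][term] / total for c in tf}
    return ratios


def classify(tokens, ratios):
    """Accumulate frequency ratios per category and return the best one."""
    scores = Counter()
    for term in terms(tokens):
        for label, ratio in ratios.get(term, {}).items():
            scores[label] += ratio
    return scores.most_common(1)[0][0] if scores else None


train_docs = [
    ("the team won the match".split(), "sport"),
    ("the team lost the match".split(), "sport"),
    ("the bank raised interest rates".split(), "economy"),
    ("the bank cut interest rates".split(), "economy"),
]
model = train(train_docs)
print(classify("team won match".split(), model))
print(classify("bank interest rates".split(), model))
```

Changing `pair_share` shifts the mixture between single words and adjacent word pairs, which is the kind of proportion the abstract says was varied. The values used here are illustrative only and are not taken from the paper.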

Cite

Hussein, M., Mousa, H. M., & Sallam, R. M. (2016). Arabic Text Categorization Using Mixed Words. International Journal of Information Technology and Computer Science, 8(11), 74–81. https://doi.org/10.5815/ijitcs.2016.11.09
