An improved Arabic text classification method using word embedding

Tarik Sabri; Said Bahassine; Omar El Beggar; Mohamed Kissi

Journal Article, Open Access

International Journal of Electrical and Computer Engineering (2024) 14(1) 721-731

DOI: 10.11591/ijece.v14i1.pp721-731

Abstract

Feature selection (FS) is a widely used method for removing redundant or irrelevant features to improve classification accuracy and reduce a model’s computational cost. In this paper, we present an improved method (referred to hereafter as RARF) for Arabic text classification (ATC) that employs term frequency-inverse document frequency (TF-IDF) weighting and the Word2Vec embedding technique to identify words that share a particular semantic relationship. We compare our method with four benchmark FS methods: principal component analysis (PCA), linear discriminant analysis (LDA), chi-square, and mutual information (MI). Three machine learning algorithms are used in this work: support vector machine (SVM), k-nearest neighbors (K-NN), and naive Bayes (NB). Two Arabic datasets are used for a comparative analysis of these algorithms. We also evaluate the efficiency of our method for ATC using four performance metrics: accuracy, precision, recall, and F-measure. The highest accuracy, 94.75%, was achieved by the SVM classifier on the Khaleej-2004 Arabic dataset, while the same classifier recorded an accuracy of 94.01% on the Watan-2004 Arabic dataset.
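To make the TF-IDF weighting mentioned in the abstract concrete, the following is a minimal sketch in plain Python. It is not the RARF method itself: the abstract does not state which TF-IDF variant the authors use, so the formula here (term count divided by document length, multiplied by ln(N / df)) is an assumption, as are the function name `tf_idf` and the toy tokenized corpus.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant (not confirmed by the paper):
    tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, already-tokenized toy corpus; it is not drawn from
# Khaleej-2004 or Watan-2004.
docs = [
    ["اقتصاد", "سوق", "اقتصاد"],
    ["رياضة", "مباراة", "سوق"],
    ["رياضة", "فريق"],
]
weights = tf_idf(docs)
```

Under this variant, a term that is frequent in one document but rare across the corpus receives a higher weight than a term shared by several documents, which is the property that makes TF-IDF useful as an input to feature selection.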

Citation (APA)

Sabri, T., Bahassine, S., El Beggar, O., & Kissi, M. (2024). An improved Arabic text classification method using word embedding. International Journal of Electrical and Computer Engineering, 14(1), 721–731. https://doi.org/10.11591/ijece.v14i1.pp721-731
