Direct text classifier for thematic Arabic discourse documents

Abstract

Maintaining topical coherence while writing a discourse is a major challenge for novice and experienced writers alike. The challenge is even greater for Arabic discourse because of the complex morphology of the Arabic language and its abundance of synonyms. In this research, we present a direct classifier that assigns a topic to an Arabic discourse document while it is being written. The proposed prescriptive framework consists of the following stages: data collection, pre-processing, construction of a Language Model (LM), topic identification, topic classification, and topic notification. To demonstrate the framework, we designed a system and applied it to a corpus of 2800 Arabic discourse documents synthesized into four predefined topics: Culture, Economy, Sport, and Religion. System performance was analysed in terms of accuracy, recall, precision, and F-measure. The results show that the proposed topic modeling-based decision framework can classify topics while a discourse is being written, with an accuracy of 91.0%.
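The abstract does not specify how the language model or the classifier is built, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a per-topic unigram language model with add-one smoothing, a whitespace tokenizer standing in for real Arabic pre-processing, and macro-averaged metrics. The names `tokenize`, `UnigramTopicClassifier`, `macro_metrics`, and the toy English `TRAIN` data are all hypothetical.

```python
import math
from collections import Counter, defaultdict

# Toy stand-in corpus (hypothetical); the paper uses 2800 Arabic documents.
TRAIN = [
    ("match goal team player", "Sport"),
    ("market bank price trade", "Economy"),
    ("prayer mosque faith fasting", "Religion"),
    ("museum poetry theatre heritage", "Culture"),
]


def tokenize(text):
    # Placeholder pre-processing: lowercase and split on whitespace.
    # Real Arabic text would need normalization and stemming (not shown).
    return text.lower().split()


class UnigramTopicClassifier:
    """Assumed model: one unigram LM per topic, add-one smoothing."""

    def __init__(self):
        self.counts = defaultdict(Counter)  # topic -> token counts
        self.totals = Counter()             # topic -> number of tokens
        self.vocab = set()

    def fit(self, documents):
        # documents: iterable of (text, topic) pairs
        for text, topic in documents:
            tokens = tokenize(text)
            self.counts[topic].update(tokens)
            self.totals[topic] += len(tokens)
            self.vocab.update(tokens)

    def log_score(self, tokens, topic):
        # Smoothed log-likelihood of the tokens under the topic's LM.
        v = len(self.vocab)
        return sum(
            math.log((self.counts[topic][t] + 1) / (self.totals[topic] + v))
            for t in tokens
        )

    def predict(self, text):
        # Can be called on a partial draft to classify "while writing".
        tokens = tokenize(text)
        return max(sorted(self.counts), key=lambda tp: self.log_score(tokens, tp))


def macro_metrics(y_true, y_pred):
    # Macro-averaged precision and recall; F-measure is their harmonic mean.
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls = [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    precision = sum(precisions) / len(labels)
    recall = sum(recalls) / len(labels)
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f}
```

A writing tool could call `predict` on the growing draft after each sentence and notify the writer when the predicted topic changes. Whether the paper averages its metrics per class in this way is not stated in the abstract.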

Cite

Nahar, K., Al-Khatib, R., Al-Shannaq, M., Daradkeh, M., & Malkawi, R. (2020). Direct text classifier for thematic Arabic discourse documents. International Arab Journal of Information Technology, 17(3), 394–403. https://doi.org/10.34028/iajit/17/3/13
