SNAD Arabic Dataset for Deep Learning

Abstract

Natural language processing (NLP) has captured the attention of researchers in recent years. It is applied in a wide range of applications and across several disciplines, and Arabic is among the languages that have benefited from it. However, only a few Arabic datasets are available to researchers, so Arabic NLP work is limited to those datasets. This paper therefore introduces a new dataset, SNAD. SNAD was collected to fill the gap in Arabic datasets, especially for classification using deep learning. The dataset has more than 45,000 records. Each record consists of a news title, the news details, and the news class, and the dataset covers six classes. Cleaning and preprocessing are applied to the raw data to make it better suited to classification. Finally, the dataset is validated using a convolutional neural network, with effective results. The dataset is freely available online.
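
As a rough illustration of the record layout and the kind of cleaning the abstract mentions, here is a minimal Python sketch. The abstract does not list the actual preprocessing steps or field names, so everything here is an assumption: the `NewsRecord` class, its field names, the `clean_text` helper, and the choice of steps (removing diacritics and tatweel, dropping non-Arabic characters, collapsing whitespace) are hypothetical and are not taken from the paper.

```python
import re
from dataclasses import dataclass

# Assumed cleaning rules (not from the paper):
# Arabic diacritics (tashkeel) occupy U+064B..U+0652; tatweel is U+0640.
_DIACRITICS = re.compile(r"[\u064B-\u0652]")
_TATWEEL = "\u0640"
# Anything that is not an Arabic letter (U+0621..U+064A) or whitespace.
_NON_ARABIC = re.compile(r"[^\u0621-\u064A\s]")


@dataclass
class NewsRecord:
    """Hypothetical layout for one record: title, details, class."""
    title: str
    details: str
    label: str


def clean_text(text: str) -> str:
    """Strip diacritics and tatweel, drop non-Arabic characters, collapse spaces."""
    text = _DIACRITICS.sub("", text)      # remove first so words are not split
    text = text.replace(_TATWEEL, "")     # tatweel lies inside the letter range
    text = _NON_ARABIC.sub(" ", text)     # digits, Latin, punctuation -> space
    return " ".join(text.split())


def clean_record(record: NewsRecord) -> NewsRecord:
    """Return a copy of the record with title and details cleaned."""
    return NewsRecord(clean_text(record.title), clean_text(record.details), record.label)
```

A record built from raw text would be passed through `clean_record` before tokenization; the class label is left untouched, and the six class names are not assumed here.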

Cite

AlSaleh, D., AlAmir, M. B., & Larabi-Marie-Sainte, S. (2021). SNAD Arabic Dataset for Deep Learning. In Advances in Intelligent Systems and Computing (Vol. 1250 AISC, pp. 630–640). Springer. https://doi.org/10.1007/978-3-030-55180-3_47
