SNAD Arabic Dataset for Deep Learning

Deem AlSaleh; Mashael Bin AlAmir; Souad Larabi-Marie-Sainte

Conference Proceedings

Advances in Intelligent Systems and Computing, Vol. 1250 AISC, pp. 630–640 (2021)

DOI: 10.1007/978-3-030-55180-3_47

Abstract

Natural language processing (NLP) has attracted considerable attention from researchers in recent years. It is applied in a wide range of applications and disciplines, and Arabic is among the languages that have benefited from it. However, only a few Arabic datasets are available to researchers, so Arabic NLP work is largely limited to those datasets. This paper therefore introduces a new dataset, SNAD, collected to fill this gap in Arabic datasets, especially for classification using deep learning. The dataset contains more than 45,000 records. Each record consists of a news title, the news details, and a news class, and there are six classes in total. The raw data were cleaned and preprocessed to make them better suited to classification. Finally, the dataset was validated using a convolutional neural network (CNN), with effective results. The dataset is freely available online.

Citation (APA)

AlSaleh, D., AlAmir, M. B., & Larabi-Marie-Sainte, S. (2021). SNAD Arabic Dataset for Deep Learning. In Advances in Intelligent Systems and Computing (Vol. 1250 AISC, pp. 630–640). Springer. https://doi.org/10.1007/978-3-030-55180-3_47
