Deep Learning-based Analysis of Algerian Dialect Dataset Targeted Hate Speech, Offensive Language and Cyberbullying

Abstract

Toxicity and hate speech on social media platforms can lead to cyber-crime, affecting social life at both the personal and community level. Automatic detection of toxic and hateful content is therefore necessary to improve the quality of web content and to counter the spread of inappropriate speech through social media. Meeting this need is challenging when comments are written in complex languages such as Arabic, which is known for its linguistic difficulty and scarcity of resources. This paper introduces a new dataset for toxic text detection in the Algerian dialect: an annotated multi-label dataset of 14,150 comments extracted from Facebook, YouTube and Twitter and labelled as hate speech, offensive language and cyberbullying. To assess the practical utility of the annotated dataset, we ran several tests with traditional machine learning (ML) classifiers, namely Random Forest, Naïve Bayes, Linear Support Vector Classifier (SVC), Stochastic Gradient Descent (SGD) and Logistic Regression. We also evaluated Deep Learning (DL) models: Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), Bidirectional LSTM (Bi-LSTM) and Bidirectional GRU (Bi-GRU). In these experiments the Bi-GRU model achieved the best results among the DL classifiers, with 73.6% accuracy and a 75.8% F1-score.
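The abstract reports results as accuracy and F1-score. As a rough illustration of how these two metrics are computed for a three-class labelling task, here is a minimal pure-Python sketch. It makes several assumptions that are not stated in the abstract: each comment carries exactly one label, the F1-score is macro-averaged over the labels, and the label strings and toy predictions are invented for the example. It does not reproduce the paper's evaluation code or its figures.

```python
# Minimal sketch of accuracy and macro-averaged F1 for a 3-class task.
# Assumptions (not from the paper): one label per comment, macro averaging,
# hypothetical label names and toy data.

LABELS = ["hate_speech", "offensive", "cyberbullying"]  # hypothetical names


def accuracy(y_true, y_pred):
    """Fraction of comments whose predicted label equals the annotation."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_for_label(y_true, y_pred, label):
    """F1 for one label: 2*TP / (2*TP + FP + FN), or 0.0 if undefined."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-label F1 scores."""
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


# Toy annotations and predictions, invented for illustration only.
y_true = ["hate_speech", "hate_speech", "offensive",
          "offensive", "cyberbullying", "cyberbullying"]
y_pred = ["hate_speech", "offensive", "offensive",
          "offensive", "cyberbullying", "hate_speech"]

if __name__ == "__main__":
    print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
    print(f"macro F1 = {macro_f1(y_true, y_pred, LABELS):.3f}")
```

Macro averaging weights every label equally regardless of how many comments carry it. A weighted or micro average would give different numbers on an imbalanced dataset, so the averaging scheme matters when comparing against reported scores.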

Cite

Mazari, A. C., & Kheddar, H. (2023). Deep Learning-based Analysis of Algerian Dialect Dataset Targeted Hate Speech, Offensive Language and Cyberbullying. International Journal of Computing and Digital Systems, 13(1), 965–972. https://doi.org/10.12785/ijcds/130177
