Enhancing Cyberbullying Detection on Indonesian Twitter: Leveraging FastText for Feature Expansion and Hybrid Approach Applying CNN and BiLSTM

Muhammad Alfi Syahri Nasution; Erwin Budi Setiawan

Journal Article (Open Access)

Revue d'Intelligence Artificielle (2023) 37(4) 929-936

DOI: 10.18280/ria.370413

Abstract

Cyberbullying, the sending of threatening, intimidating, or derogatory messages through digital platforms such as Twitter, is a pervasive problem. With approximately 867 million tweets posted daily, the potential scale of cyberbullying is immense, which makes automated detection of such messages necessary. However, tweets are context-sensitive, and their content can be hard to interpret, particularly in languages such as Indonesian, where vocabulary mismatch can be significant. This study aims to improve cyberbullying detection by applying feature expansion with FastText to address vocabulary-related comprehension problems in Indonesian-language tweets. Text classification is then performed with a Hybrid Deep Learning approach that combines Convolutional Neural Networks (CNN) and Bidirectional Long Short-Term Memory (BiLSTM). The hybrid model draws on the strengths of both techniques, capturing local patterns and long-range dependencies in the data. The objective is to evaluate the performance obtained when FastText-based feature expansion and Hybrid Deep Learning are applied to an Indonesian Twitter dataset. This focus is motivated by the high accuracy that Hybrid Deep Learning has achieved on Twitter datasets in other languages and by the limited use of such methods on Indonesian-language datasets, where studies predominantly use supervised learning or deep learning. Experiments on 29,085 data records showed that combining Hybrid Deep Learning with FastText-based feature expansion achieved the highest accuracy, with CNN-BiLSTM and BiLSTM-CNN scoring 80.55% and 80.35%, respectively. These findings indicate that FastText provides a significant accuracy gain when integrated with Hybrid Deep Learning.
The outcomes of this study are expected to support the accurate identification and removal of cyberbullying tweets, contributing to a safer communication environment on Twitter.
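
The abstract does not spell out how the feature expansion operates, so the following is only a minimal sketch of one common reading of the idea: when a vocabulary feature is missing from a tweet but one of its nearest embedding neighbours is present, the feature is switched on. The neighbour table `TOY_NEIGHBOURS`, the function `expand_features`, and the `top_k` parameter are hypothetical names introduced for illustration; in practice the neighbour lists would come from a trained FastText model rather than a hand-written dictionary.

```python
from typing import Dict, List, Sequence

# Hypothetical stand-in for nearest neighbours taken from a trained
# FastText model; the words and rankings here are illustrative only.
TOY_NEIGHBOURS: Dict[str, List[str]] = {
    "bodoh": ["tolol", "goblok"],
    "jelek": ["buruk"],
}


def expand_features(
    tokens: Sequence[str],
    vocabulary: Sequence[str],
    neighbours: Dict[str, List[str]],
    top_k: int = 2,
) -> List[int]:
    """Build a binary feature vector over `vocabulary` for one tweet.

    A feature is 1 if the word itself occurs in the tweet, or if any of
    its first `top_k` neighbours occurs (the assumed expansion step).
    """
    present = set(tokens)
    vector = []
    for word in vocabulary:
        if word in present:
            vector.append(1)
        elif any(n in present for n in neighbours.get(word, [])[:top_k]):
            vector.append(1)  # feature recovered through a similar word
        else:
            vector.append(0)
    return vector


vocabulary = ["bodoh", "jelek", "teman"]
tweet = ["kamu", "tolol", "banget"]

plain = [1 if w in set(tweet) else 0 for w in vocabulary]
expanded = expand_features(tweet, vocabulary, TOY_NEIGHBOURS)

print(plain)     # [0, 0, 0]
print(expanded)  # [1, 0, 0]
```

Without expansion the toy tweet shares no word with the vocabulary and yields an all-zero vector; with expansion, "tolol" activates the "bodoh" feature. The expanded vectors would then be passed to the classifier, which in the study is a CNN-BiLSTM or BiLSTM-CNN hybrid.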

Citation (APA)

Nasution, M. A. S., & Setiawan, E. B. (2023). Enhancing Cyberbullying Detection on Indonesian Twitter: Leveraging FastText for Feature Expansion and Hybrid Approach Applying CNN and BiLSTM. Revue d’Intelligence Artificielle, 37(4), 929–936. https://doi.org/10.18280/ria.370413
