Comparison of Online Gambling Promotion Detection Performance Using DistilBERT and DeBERTa Models

  • Pratama H
  • Wijayakusuma I
  • Widiastuti R

Abstract

Online gambling promotions on social media have become a serious concern in Indonesia, where perpetrators use ambiguous and disguised language to evade detection. This study compares two transformer-based models, DistilBERT and DeBERTa, in detecting such content within Indonesian YouTube comments. Using a balanced dataset of 6,350 comments, both models were fine-tuned with optimized hyperparameters (learning rate 1e-5, batch size 32, 5 epochs) and evaluated through five-fold cross-validation. Results show that DeBERTa achieves superior performance with 99.84% accuracy and perfect recall, while DistilBERT achieves 99.29% accuracy. Error and linguistic analyses indicate that DeBERTa's disentangled attention and Byte-Pair Encoding provide a better understanding of non-standard and ambiguous language. Despite its higher computational cost, DeBERTa is ideal for high-accuracy applications, whereas DistilBERT remains suitable for real-time and resource-limited environments.
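The five-fold cross-validation protocol mentioned in the abstract can be sketched in plain Python. This is an illustration only, not the authors' code: the function name, seed, and toy two-class data are hypothetical, and it shows stratified fold assignment so each fold keeps the dataset's class balance.

```python
import random
from collections import defaultdict

def stratified_kfold(labels, k=5, seed=42):
    """Yield (train_idx, test_idx) pairs, preserving per-class balance in each fold."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    # Deal each class's shuffled indices round-robin across the k folds
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for j, i in enumerate(idxs):
            folds[j % k].append(i)
    for t in range(k):
        test = sorted(folds[t])
        train = sorted(i for f in range(k) if f != t for i in folds[f])
        yield train, test

# Toy balanced two-class set standing in for the paper's 6,350 comments
labels = ["gambling"] * 50 + ["benign"] * 50
splits = list(stratified_kfold(labels, k=5))
```

Each of the five test folds here holds 20 comments (10 per class), and the five test folds together partition the full index set, so every comment is evaluated exactly once.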

APA

Pratama, H. M., Wijayakusuma, I. L., & Widiastuti, R. S. (2025). Comparison of Online Gambling Promotion Detection Performance Using DistilBERT and DeBERTa Models. Journal of Applied Informatics and Computing, 9(6), 3716–3725. https://doi.org/10.30871/jaic.v9i6.11293
