Abstract
Online gambling promotions on social media have become a serious concern in Indonesia, where perpetrators use ambiguous and disguised language to evade detection. This study compares two transformer-based models, DistilBERT and DeBERTa, in detecting such content within Indonesian YouTube comments. Using a balanced dataset of 6,350 comments, both models were fine-tuned with optimized hyperparameters (learning rate 1e-5, batch size 32, 5 epochs) and evaluated through five-fold cross-validation. Results show that DeBERTa achieves superior performance with 99.84% accuracy and perfect recall, while DistilBERT achieves 99.29% accuracy. Error and linguistic analyses indicate that DeBERTa’s disentangled attention and Byte-Pair Encoding provide better understanding of non-standard and ambiguous language. Despite requiring higher computational cost, DeBERTa is ideal for high-accuracy applications, whereas DistilBERT remains suitable for real-time and resource-limited environments.
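The five-fold evaluation described in the abstract can be illustrated with a minimal sketch of how the splits might be formed. It rests on assumptions the abstract does not confirm: that "balanced" means exactly 3,175 comments per class, and that the folds are stratified by label. The function names, the dictionary keys, and the placeholder labels are hypothetical; only the hyperparameter values and the dataset size come from the abstract.

```python
import random

# Values quoted in the abstract; the key names are illustrative only.
HYPERPARAMS = {"learning_rate": 1e-5, "batch_size": 32, "epochs": 5}


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds, keeping class proportions equal."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices round-robin so each fold gets an equal share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def cross_validation_splits(labels, k=5, seed=0):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    folds = stratified_folds(labels, k, seed)
    for i in range(k):
        val = sorted(folds[i])
        train = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train, val


# Placeholder labels standing in for the 6,350 comments
# (assumed: 0 = ordinary comment, 1 = gambling promotion).
labels = [0] * 3175 + [1] * 3175
splits = list(cross_validation_splits(labels))
```

Under these assumptions each fold holds out 1,270 comments (635 per class) and trains on the remaining 5,080, so every comment is used for validation exactly once across the five runs.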
Pratama, H. M., Wijayakusuma, I. L., & Widiastuti, R. S. (2025). Comparison of Online Gambling Promotion Detection Performance Using DistilBERT and DeBERTa Models. Journal of Applied Informatics and Computing, 9(6), 3716–3725. https://doi.org/10.30871/jaic.v9i6.11293