Transformer Based Model For Offensive Content Recognition In Dravidian Languages


Abstract

This paper describes a model for detecting offensive content in comments collected from social media. Such comments often contain expressions and emoticons and are mostly written in code-mixed language, which makes them difficult to classify. The proposed system uses a multi-head attention model to extract features from code-mixed Tamil input data. Several classification algorithms are applied to the extracted features to identify offensive comments, and the final label for each comment is chosen by majority voting over the labels produced by the different algorithms. The system is validated on the validation set and evaluated on the Tamil CodeMix test data from the dataset published for the HASOC task (Task2-subtask1) at FIRE 2021. It achieves an average weighted F1 score of 0.83 and ranks 3rd in the official ranking.
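The majority-voting step and the weighted F1 metric mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the label names `OFF` and `NOT`, the function names `majority_vote` and `weighted_f1`, and the tie-breaking rule (prefer the label from the earliest-listed classifier) are assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per comment.

    `predictions` holds one list of labels per classifier, all of equal length.
    Tie-breaking is an assumption: among labels tied for the most votes, the
    one predicted by the earliest-listed classifier wins.
    """
    if not predictions:
        raise ValueError("at least one classifier's predictions are required")
    n = len(predictions[0])
    if any(len(p) != n for p in predictions):
        raise ValueError("all classifiers must label the same number of comments")

    voted = []
    for votes in zip(*predictions):  # the votes cast for a single comment
        counts = Counter(votes)
        top = max(counts.values())
        voted.append(next(v for v in votes if counts[v] == top))
    return voted


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to class support."""
    total = len(y_true)
    score = 0.0
    for label in set(y_true):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * (tp + fn) / total  # tp + fn is the class support
    return score


# Hypothetical labels from three classifiers for three comments.
classifier_outputs = [
    ["OFF", "NOT", "OFF"],
    ["OFF", "OFF", "NOT"],
    ["NOT", "NOT", "OFF"],
]
final_labels = majority_vote(classifier_outputs)
```

With an odd number of classifiers and two classes, ties cannot occur, so the tie-breaking rule only matters for an even number of voters or more than two labels.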

Cite

Divya, S., & Sripriya, N. (2021). Transformer Based Model For Offensive Content Recognition In Dravidian Languages. In CEUR Workshop Proceedings (Vol. 3159, pp. 651–658). CEUR-WS. https://doi.org/10.34117/bjdv9n12-006
