Meta-Learning for Offensive Language Detection in Code-Mixed Texts


Abstract

This research investigates the application of Model-Agnostic Meta-Learning (MAML) and ProtoMAML to the identification of offensive content in Tamil-English and Malayalam-English code-mixed social media text. We follow a two-step strategy: first, an XLM-RoBERTa (XLM-R) model is meta-trained on a variety of tasks comprising code-mixed data, monolingual data in the target languages, and related tasks in other languages; the model is then fine-tuned on the target tasks of identifying offensive language in Malayalam-English and Tamil-English code-mixed texts. Our results show that meta-learning significantly improves model performance on low-resource (few-shot) tasks. We also introduce a weighted data sampling approach that helps the model converge better during meta-training than traditional sampling methods.
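The meta-training loop described above (sample a task, adapt on a support set, update the meta-parameters from the query loss, with tasks drawn under non-uniform weights) can be illustrated in miniature. The sketch below is not the authors' implementation: it uses a one-parameter linear model instead of XLM-R, toy regression "tasks" standing in for the code-mixed, monolingual, and related-language datasets, a first-order approximation of the MAML outer gradient, and hypothetical sampling weights chosen for illustration.

```python
import random

random.seed(0)

# Toy stand-ins for data sources: each "task" is y = a*x for some slope a,
# learned by a one-parameter model w with MSE loss. (Hypothetical analogue
# of code-mixed / monolingual / related-language tasks.)
def make_task(slope):
    xs = [0.5, 1.0, 1.5, 2.0]
    return [(x, slope * x) for x in xs]

tasks = {
    "code_mixed": make_task(2.0),
    "monolingual": make_task(1.8),
    "related_lang": make_task(2.2),
}

# Hypothetical sampling weights favouring target-like tasks (the paper's
# weighted data sampling idea; the actual weights are not given here).
weights = {"code_mixed": 0.5, "monolingual": 0.3, "related_lang": 0.2}

def grad(w, batch):
    # d/dw of mean((w*x - y)^2) = mean(2*(w*x - y)*x)
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

w = 0.0                    # meta-parameters
alpha, beta = 0.05, 0.05   # inner / outer learning rates

for step in range(500):
    # Weighted task sampling instead of uniform sampling.
    name = random.choices(list(tasks), weights=[weights[t] for t in tasks])[0]
    data = tasks[name]
    support, query = data[:2], data[2:]

    # Inner loop: one gradient step of task-specific adaptation.
    w_task = w - alpha * grad(w, support)

    # Outer loop: first-order MAML update from the query loss at w_task.
    w = w - beta * grad(w_task, query)

# After meta-training, w sits near a weighted compromise of the task slopes,
# so a single inner step adapts it quickly to any one task (few-shot setting).
print(w)
```

Fine-tuning on the target task then corresponds to running only the inner-loop adaptation from the meta-trained `w`, which is the few-shot setting where the paper reports the largest gains.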

Citation (APA)

Suresh, G. V., Chakravarthi, B. R., & McCrae, J. P. (2021). Meta-Learning for Offensive Language Detection in Code-Mixed Texts. In ACM International Conference Proceeding Series (pp. 58–66). Association for Computing Machinery. https://doi.org/10.1145/3503162.3503167
