This research investigates the application of Model-Agnostic Meta-Learning (MAML) and ProtoMAML to the detection of offensive content in Tamil-English and Malayalam-English code-mixed social media text. We follow a two-step strategy: first, the XLM-RoBERTa (XLM-R) model is meta-trained on a variety of tasks comprising code-mixed data, monolingual data in the same languages as the targets, and related tasks in other languages; the model is then fine-tuned on the target tasks of identifying offensive language in Malayalam-English and Tamil-English code-mixed texts. Our results show that meta-learning significantly improves model performance on low-resource (few-shot) tasks. We also introduce a weighted data sampling approach that helps the model converge better during meta-training than traditional sampling methods.
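The meta-training step described above follows the standard MAML pattern: adapt a copy of the parameters on each task's support set, then update the shared parameters using the loss on the query set. The sketch below is a minimal, first-order illustration on a one-parameter linear model; it is not the paper's implementation (which uses XLM-R and second-order gradients), and all names here are illustrative.

```python
import numpy as np

def loss_and_grad(w, x, y):
    # Squared-error loss and its gradient for a 1-parameter model y_hat = w * x.
    pred = w * x
    loss = np.mean((pred - y) ** 2)
    grad = np.mean(2 * (pred - y) * x)
    return loss, grad

def maml_step(w, tasks, inner_lr=0.01, outer_lr=0.001):
    # First-order MAML: for each task, take one inner-loop gradient step on
    # the support set, then accumulate the query-set gradient evaluated at
    # the adapted parameters; finally apply the averaged meta-gradient.
    meta_grad = 0.0
    for (xs, ys), (xq, yq) in tasks:
        _, g = loss_and_grad(w, xs, ys)   # inner-loop gradient on support set
        w_adapted = w - inner_lr * g      # task-specific adaptation
        _, gq = loss_and_grad(w_adapted, xq, yq)  # query-set gradient
        meta_grad += gq
    return w - outer_lr * meta_grad / len(tasks)
```

After repeated meta-steps over a family of tasks, the shared parameter settles at a point from which a single inner-loop step adapts well to any individual task — the same principle the paper applies to XLM-R across its code-mixed and monolingual auxiliary tasks.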
CITATION
Suresh, G. V., Chakravarthi, B. R., & McCrae, J. P. (2021). Meta-Learning for Offensive Language Detection in Code-Mixed Texts. In ACM International Conference Proceeding Series (pp. 58–66). Association for Computing Machinery. https://doi.org/10.1145/3503162.3503167