eevvgg at SemEval-2023 Task 11: Offensive Language Classification with Rater-based Information

Ewelina Gajewska

Conference Proceedings, Open Access

17th International Workshop on Semantic Evaluation, SemEval 2023 - Proceedings of the Workshop (2023) 171-176

DOI: 10.18653/v1/2023.semeval-1.24

Abstract

SemEval-2023 Task 11 challenges the standard majority-based approach to text classification with an individualised one, in which annotator disagreements are treated as a useful source of information for the training pipeline. The team’s proposal uses partially disaggregated data, together with additional information about annotators provided by the organisers, to train a BERT-based model for offensive text classification. The approach extends previous studies that examined the impact of raters’ demographic features on classification performance (Hovy, 2015) and the training of machine learning models on disaggregated data (Davani et al., 2022). The proposed approach ranked 11th across all four datasets and scored best where the pool of annotators is large (6th place on the MD-Agreement dataset), using features based on raters’ annotation behaviour.
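
To make the contrast between majority-based and disaggregated labels concrete, the following is a minimal Python sketch on made-up annotations. It is not the paper's implementation: the toy data, the function names, and the particular behaviour-based rater feature (each rater's share of "offensive" votes) are assumptions chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy data: per-item labels from several raters (1 = offensive).
annotations = {
    "text_a": {"r1": 1, "r2": 1, "r3": 0},
    "text_b": {"r1": 0, "r2": 0, "r3": 0},
}

def majority_label(labels):
    """Aggregate to one hard label by majority vote (ties go to the lowest label)."""
    counts = Counter(labels.values())
    top = max(counts.values())
    return min(label for label, n in counts.items() if n == top)

def soft_label(labels):
    """Fraction of raters who marked the item offensive."""
    return sum(labels.values()) / len(labels)

def disaggregate(data):
    """One training row per (text, rater, label) instead of one row per text."""
    return [(text, rater, label)
            for text, labels in data.items()
            for rater, label in labels.items()]

def rater_offensive_rate(data):
    """Assumed behaviour-based feature: share of items each rater marked offensive."""
    totals, positives = Counter(), Counter()
    for labels in data.values():
        for rater, label in labels.items():
            totals[rater] += 1
            positives[rater] += label
    return {rater: positives[rater] / totals[rater] for rater in totals}

rows = disaggregate(annotations)
rates = rater_offensive_rate(annotations)
```

In a setup like this, each disaggregated row could be paired with its rater's feature value and passed to a classifier alongside the text, whereas the majority-vote view keeps only one label per text and discards the disagreement on `text_a`.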

Citation (APA)

Gajewska, E. (2023). eevvgg at SemEval-2023 Task 11: Offensive Language Classification with Rater-based Information. In 17th International Workshop on Semantic Evaluation, SemEval 2023 - Proceedings of the Workshop (pp. 171–176). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.semeval-1.24
