Fine-Grained Multi-label Sexism Classification Using a Semi-Supervised Multi-level Neural Approach

Harika Abburi; Pulkit Parikh; Niyati Chhaya; Vasudeva Varma

Journal Article, Open Access

Data Science and Engineering (2021) 6(4) 359-379

DOI: 10.1007/s41019-021-00168-y

Abstract

Sexism, a pervasive form of oppression, causes profound suffering through various manifestations. Given the increasing number of experiences of sexism shared online, categorizing these recollections automatically can support the fight against sexism, since it can facilitate effective analyses by gender studies researchers and government representatives engaged in policy making. In this paper, we examine the fine-grained, multi-label classification of accounts (reports) of sexism. To the best of our knowledge, we consider substantially more categories of sexism than any related prior work through our 23-class problem formulation. Moreover, we present the first semi-supervised work for the multi-label classification of accounts describing any type(s) of sexism. We devise self-training-based techniques tailor-made for the multi-label nature of the problem to utilize unlabeled samples for augmenting the labeled set. We identify high textual diversity with respect to the existing labeled set as a desirable quality for candidate unlabeled instances and develop methods for incorporating it into our approach. We also explore ways of infusing class imbalance alleviation for multi-label classification into our semi-supervised learning, independently and in conjunction with the method involving diversity. In addition to data augmentation methods, we develop a neural model which combines biLSTM and attention with a domain-adapted BERT model in an end-to-end trainable manner. Further, we formulate a multi-level training approach in which models are sequentially trained using categories of sexism of different levels of granularity. We also devise a loss function that exploits any label confidence scores associated with the data. Several proposed methods outperform various baselines on a recently released dataset for multi-label sexism categorization across several standard metrics.
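
The abstract mentions a loss function that exploits label confidence scores but does not give its form. As a minimal sketch only, one plausible reading is a multi-label binary cross-entropy in which each label's term is scaled by its confidence. The function name `confidence_weighted_bce`, the per-label weighting scheme, and the averaging over labels are all assumptions made for illustration and are not taken from the paper.

```python
import math


def confidence_weighted_bce(probs, labels, confidences, eps=1e-7):
    """Illustrative (assumed) confidence-weighted multi-label BCE.

    probs:        predicted probability per class, each in [0, 1]
    labels:       binary ground-truth indicator per class (0 or 1)
    confidences:  hypothetical per-label confidence weight in [0, 1]
    Returns the mean weighted binary cross-entropy over the classes.
    """
    total = 0.0
    for p, y, c in zip(probs, labels, confidences):
        # Clamp the probability to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        # Standard binary cross-entropy term for one class.
        term = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        # Scale the term by the (assumed) label confidence.
        total += c * term
    return total / len(probs)


# With all confidences equal to 1.0 this reduces to ordinary BCE.
full = confidence_weighted_bce([0.5, 0.5], [1, 0], [1.0, 1.0])
# Halving every confidence halves the loss.
half = confidence_weighted_bce([0.5, 0.5], [1, 0], [0.5, 0.5])
```

Under this assumed weighting, low-confidence labels contribute less to the gradient, so noisy annotations pull the model less strongly than confident ones.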

Cite

Abburi, H., Parikh, P., Chhaya, N., & Varma, V. (2021). Fine-Grained Multi-label Sexism Classification Using a Semi-Supervised Multi-level Neural Approach. Data Science and Engineering, 6(4), 359–379. https://doi.org/10.1007/s41019-021-00168-y
