Abstract
Recent advances in representation learning have revealed a strong tendency for pre-trained word embeddings to exhibit unfair and discriminatory gender stereotypes. These usually take the form of unjustified associations between representations of group words (e.g., male or female) and attribute words (e.g., driving, cooking, doctor, nurse). In this paper, we propose an iterative and adversarial procedure to reduce gender bias in word vectors. We aim to remove gender influence from word representations that should otherwise be free of it, while retaining meaningful gender information in words that are inherently charged with gender polarity (male or female). We confine these gender signals to a sub-vector of the word embeddings to make them more interpretable. Quantitative and qualitative experiments confirm that our method successfully reduces gender bias in pre-trained word embeddings with minimal semantic offset.
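The core idea of removing gender influence from gender-neutral words while preserving it in inherently gendered ones can be illustrated with a minimal projection-based sketch. This is not the paper's iterative adversarial procedure; it is a simpler hard-debiasing-style baseline on hypothetical toy vectors, where the gender direction is estimated from a single definitional pair (`he` - `she`). All word choices and vectors below are illustrative assumptions.

```python
# Illustrative sketch (NOT the paper's adversarial method): estimate a gender
# direction in toy embeddings, then zero out that component for words that
# should be gender-neutral while leaving gendered words untouched.
import numpy as np

rng = np.random.default_rng(0)
dim = 8

# Hypothetical toy embeddings; real experiments would use pre-trained vectors.
emb = {w: rng.normal(size=dim)
       for w in ["he", "she", "doctor", "nurse", "king", "queen"]}

# Estimate a gender direction from a definitional pair and normalize it.
g = emb["he"] - emb["she"]
g /= np.linalg.norm(g)

def debias(vec, gender_dir):
    """Remove the projection of vec onto the gender direction."""
    return vec - np.dot(vec, gender_dir) * gender_dir

# Gender-neutral occupation words lose their gender component...
for w in ["doctor", "nurse"]:
    emb[w] = debias(emb[w], g)

# ...while inherently gendered words keep theirs.
gendered_signal = float(np.dot(emb["king"], g))
```

After debiasing, the projection of "doctor" and "nurse" onto the gender direction is (numerically) zero, while "king" retains a non-zero gender component. The paper goes further by learning this separation adversarially and confining the retained gender signal to a dedicated sub-vector.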
Gaci, Y., Benatallah, B., Casati, F., & Benabdeslem, K. (2022). Iterative adversarial removal of gender bias in pretrained word embeddings. In Proceedings of the ACM Symposium on Applied Computing (pp. 829–836). Association for Computing Machinery. https://doi.org/10.1145/3477314.3507274