WEFE: The word embeddings fairness evaluation framework


Abstract

Word embeddings are known to exhibit stereotypical biases with respect to gender, race, and religion, among other criteria. Several fairness metrics have been proposed in order to automatically quantify these biases. Although all metrics have a similar objective, the relationship between them is by no means clear. Two issues that prevent a clean comparison are that they operate with different inputs, and that their outputs are incompatible with each other. In this paper we propose WEFE, the word embeddings fairness evaluation framework, to encapsulate, evaluate and compare fairness metrics. Our framework takes as input a list of pre-trained embeddings and a set of fairness criteria, and it is based on checking correlations between the fairness rankings induced by these criteria. We conduct a case study showing that rankings produced by existing fairness methods tend to correlate when measuring gender bias. This correlation is considerably weaker for other biases such as race or religion. We also compare the fairness rankings with an embedding benchmark, showing that there is no clear correlation between fairness and good performance in downstream tasks.
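The core idea described above — metrics whose raw outputs are incompatible can still be compared through the rankings they induce over a set of embeddings — can be sketched as follows. This is an illustrative toy example, not the WEFE API: the model names and bias scores are made up, and the rank correlation is computed with the standard Spearman formula.

```python
# Sketch of WEFE's ranking-correlation idea: two fairness metrics with
# incompatible output scales are compared via the rank order they induce
# over a set of embedding models. All scores here are hypothetical.

def ranks(scores):
    """Map each model to its 1-based rank, ordered by ascending score."""
    ordered = sorted(scores, key=scores.get)
    return {model: i + 1 for i, model in enumerate(ordered)}

def spearman(scores_a, scores_b):
    """Spearman rank correlation between two score dicts (assumes no ties)."""
    ra, rb = ranks(scores_a), ranks(scores_b)
    n = len(ra)
    d2 = sum((ra[m] - rb[m]) ** 2 for m in ra)
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical gender-bias scores from two metrics on different scales.
metric_a = {"glove": 0.62, "word2vec": 0.55, "fasttext": 0.40}
metric_b = {"glove": 1.80, "word2vec": 1.10, "fasttext": 0.90}

# Raw scores are incomparable, but both metrics rank the models identically.
print(spearman(metric_a, metric_b))  # prints 1.0
```

A correlation near 1 indicates the two metrics agree on which embeddings are more biased, which is the kind of agreement the case study finds for gender bias but not for race or religion.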

Cite

CITATION STYLE

APA

Badilla, P., Bravo-Marquez, F., & Pérez, J. (2020). WEFE: The word embeddings fairness evaluation framework. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 2021-January, pp. 430–436). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2020/60
