WEFE: The word embeddings fairness evaluation framework


Abstract

Word embeddings are known to exhibit stereotypical biases with respect to gender, race, and religion, among other criteria. Several fairness metrics have been proposed in order to automatically quantify these biases. Although all metrics have a similar objective, the relationship between them is by no means clear. Two issues that prevent a clean comparison are that they operate with different inputs, and that their outputs are incompatible with each other. In this paper we propose WEFE, the word embeddings fairness evaluation framework, to encapsulate, evaluate and compare fairness metrics. Our framework takes as input a list of pre-trained embeddings and a set of fairness criteria, and it is based on checking correlations between the fairness rankings induced by these criteria. We conduct a case study showing that rankings produced by existing fairness methods tend to correlate when measuring gender bias. This correlation is considerably weaker for other biases such as race or religion. We also compare the fairness rankings with an embedding benchmark, showing that there is no clear correlation between fairness and good performance in downstream tasks.
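The core idea described above — metrics whose raw outputs are incompatible can still be compared through the rankings they induce over a set of embeddings — can be sketched as follows. This is an illustrative toy example, not the WEFE API: the model names and bias scores are made up, and the rank correlation is computed with the standard Spearman formula.

```python
# Sketch of WEFE's ranking-correlation idea: two fairness metrics with
# incompatible output scales are compared via the rank order they induce
# over a set of embedding models. All scores here are hypothetical.

def ranks(scores):
    """Map each model to its 1-based rank, ordered by ascending score."""
    ordered = sorted(scores, key=scores.get)
    return {model: i + 1 for i, model in enumerate(ordered)}

def spearman(scores_a, scores_b):
    """Spearman rank correlation between two score dicts (assumes no ties)."""
    ra, rb = ranks(scores_a), ranks(scores_b)
    n = len(ra)
    d2 = sum((ra[m] - rb[m]) ** 2 for m in ra)
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical gender-bias scores from two metrics on different scales.
metric_a = {"glove": 0.62, "word2vec": 0.55, "fasttext": 0.40}
metric_b = {"glove": 1.80, "word2vec": 1.10, "fasttext": 0.90}

# Raw scores are incomparable, but both metrics rank the models identically.
print(spearman(metric_a, metric_b))  # prints 1.0
```

A correlation near 1 indicates the two metrics agree on which embeddings are more biased, which is the kind of agreement the case study finds for gender bias but not for race or religion.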

Cite

CITATION STYLE

APA

Badilla, P., Bravo-Marquez, F., & Pérez, J. (2020). WEFE: The word embeddings fairness evaluation framework. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 2021-January, pp. 430–436). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2020/60
