On the Interpretability and Significance of Bias Metrics in Texts: a PMI-based Approach

Abstract

In recent years, word embeddings have been widely used to measure biases in texts. Although they have proven effective at detecting a wide variety of biases, metrics based on word embeddings lack transparency and interpretability. We analyze an alternative metric based on pointwise mutual information (PMI) to quantify biases in texts. It can be expressed as a function of conditional probabilities, which provides a simple interpretation in terms of word co-occurrences. We also prove that it can be approximated by an odds ratio, which allows estimating confidence intervals and the statistical significance of textual biases. This approach produces results similar to those of metrics based on word embeddings when capturing real-world gender gaps embedded in large corpora.
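
To make the two properties mentioned in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the bias of a target word x toward context group A versus group B is the difference PMI(x, A) - PMI(x, B), which reduces to log(P(x|A) / P(x|B)). It also assumes a standard Wald interval for the log odds ratio. The function names and the co-occurrence counts are hypothetical and serve only as an illustration.

```python
import math


def pmi_bias(c_xa, c_a, c_xb, c_b):
    """PMI(x, A) - PMI(x, B), written as log(P(x|A) / P(x|B)).

    c_xa, c_xb: co-occurrence counts of x with groups A and B (hypothetical).
    c_a, c_b:   total co-occurrence counts of groups A and B with any word.
    """
    p_x_given_a = c_xa / c_a
    p_x_given_b = c_xb / c_b
    return math.log(p_x_given_a / p_x_given_b)


def log_odds_ratio_ci(c_xa, c_a, c_xb, c_b, z=1.96):
    """Log odds ratio of the 2x2 co-occurrence table with a Wald interval.

    The interval is the textbook one for a log odds ratio; it is an
    assumption here, not a procedure quoted from the paper.
    """
    a, b = c_xa, c_a - c_xa  # x with A, other words with A
    c, d = c_xb, c_b - c_xb  # x with B, other words with B
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, log_or - z * se, log_or + z * se


# Hypothetical counts: x co-occurs 30 times with A and 10 times with B,
# and each group has 10,000 co-occurrences in total.
bias = pmi_bias(30, 10_000, 10, 10_000)
log_or, lower, upper = log_odds_ratio_ci(30, 10_000, 10, 10_000)
print(f"PMI bias: {bias:.3f}")
print(f"log odds ratio: {log_or:.3f}, 95% CI: ({lower:.3f}, {upper:.3f})")
```

When x is rare relative to both groups, the log odds ratio stays close to the PMI-based value, and an interval that excludes zero suggests the bias is statistically significant at the chosen level.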

Cite

Valentini, F., Rosati, G., Blasi, D., Slezak, D. F., & Altszyler, E. (2023). On the Interpretability and Significance of Bias Metrics in Texts: a PMI-based Approach. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 2, pp. 509–520). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.acl-short.44
