A survey on datasets for fairness-aware machine learning

Tai Le Quy; Arjun Roy; Vasileios Iosifidis; Wenbin Zhang; Eirini Ntoutsi

Article (Open Access)

Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery

DOI: 10.1002/widm.1452

267 Citations

257 Readers

Abstract

As decision-making increasingly relies on machine learning (ML) and (big) data, the issue of fairness in data-driven artificial intelligence systems is receiving increasing attention from both research and industry. A large variety of fairness-aware ML solutions have been proposed which involve fairness-related interventions in the data, learning algorithms, and/or model outputs. However, a vital part of proposing new approaches is evaluating them empirically on benchmark datasets that represent realistic and diverse settings. Therefore, in this paper, we survey real-world datasets used for fairness-aware ML. We focus on tabular data as the most common data representation for fairness-aware ML. We start our analysis by identifying relationships between the different attributes, particularly with respect to the protected attributes and the class attribute, using a Bayesian network. For a deeper understanding of bias in the datasets, we investigate interesting relationships using exploratory analysis. This article is categorized under: Commercial, Legal, and Ethical Issues > Fairness in Data Mining; Fundamental Concepts of Data and Knowledge > Data Preprocessing.
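As a rough illustration of the kind of exploratory analysis the abstract mentions, the sketch below compares how often the positive class occurs in each group of a protected attribute. It is not the authors' procedure: the function name `positive_rate_by_group`, the column names `sex` and `income`, and the toy records are all hypothetical, chosen only to make the idea concrete.

```python
from collections import defaultdict


def positive_rate_by_group(rows, protected, label, positive):
    """Return {group value: share of rows whose label equals `positive`}."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for row in rows:
        tally = counts[row[protected]]
        tally[0] += row[label] == positive  # True counts as 1
        tally[1] += 1
    return {group: pos / total for group, (pos, total) in counts.items()}


# Hypothetical toy records, not taken from any dataset in the survey.
toy_rows = [
    {"sex": "female", "income": ">50K"},
    {"sex": "female", "income": "<=50K"},
    {"sex": "female", "income": "<=50K"},
    {"sex": "female", "income": "<=50K"},
    {"sex": "male", "income": ">50K"},
    {"sex": "male", "income": ">50K"},
    {"sex": "male", "income": "<=50K"},
    {"sex": "male", "income": "<=50K"},
]

rates = positive_rate_by_group(
    toy_rows, protected="sex", label="income", positive=">50K"
)
# Difference in positive-class rate between the two groups.
gap = rates["male"] - rates["female"]
print(rates, gap)
```

A large gap between groups is only a first signal that the protected attribute and the class attribute are related; it says nothing about the cause, which is why a survey of this kind pairs such summaries with a structural view of the attributes.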

Citation (APA)

Le Quy, T., Roy, A., Iosifidis, V., Zhang, W., & Ntoutsi, E. (2022, May 1). A survey on datasets for fairness-aware machine learning. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery. John Wiley and Sons Inc. https://doi.org/10.1002/widm.1452
