Learning from Discriminatory Training Data


Abstract

Supervised learning systems are trained on historical data and, if that data was tainted by discrimination, they may unintentionally learn to discriminate against protected groups. We propose that fair learning methods, despite training on potentially discriminatory datasets, should perform well on fair test datasets. Such dataset shifts crystallize application scenarios for specific fair learning methods. For instance, the removal of direct discrimination can be represented as a particular dataset shift problem. For this scenario, we propose a learning method that provably minimizes model error on fair datasets, while blindly training on datasets poisoned with direct additive discrimination. The method is compatible with existing legal systems and provides a solution to the widely discussed issue of the intersectionality of protected groups by striking a balance between them. Technically, the method applies probabilistic interventions, has causal and counterfactual formulations, and is computationally lightweight: it can be used with any supervised learning model to prevent direct and indirect discrimination via proxies while maximizing model accuracy for business necessity.
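The dataset shift the abstract describes, training blindly on labels poisoned with direct additive discrimination while aiming for low error on a fair test distribution, can be pictured with a toy simulation. The sketch below is an illustration of that setup only, not the paper's algorithm; the bias value, the mean-gap estimator, and all names are our assumptions for the example.

```python
import random

random.seed(0)
BIAS = -1.5  # assumed direct additive discrimination against group a = 1

# Historical (poisoned) training data: fair outcome 2*x, shifted down
# by BIAS whenever the protected attribute a equals 1.
train = []
for _ in range(2000):
    x = random.random()          # legitimate feature
    a = random.randint(0, 1)     # protected attribute, independent of x here
    y = 2.0 * x + BIAS * a + random.gauss(0, 0.1)  # tainted label
    train.append((x, a, y))

# Because x and a are independent in this toy setup, the direct additive
# discrimination appears as the mean outcome gap between the two groups.
y1 = [y for x, a, y in train if a == 1]
y0 = [y for x, a, y in train if a == 0]
est_bias = sum(y1) / len(y1) - sum(y0) / len(y0)

# A "debiased" predictor learned from the poisoned data would subtract
# the estimated direct effect of a, so its predictions match the fair
# labels 2*x on a discrimination-free test set.
def fair_predict(x, a, slope=2.0):  # slope assumed known for brevity
    return (slope * x + est_bias * a) - est_bias * a

print(est_bias)  # estimate of the additive discrimination, near BIAS
```

In a realistic setting the legitimate features and the protected attribute are correlated, so the gap must be estimated jointly with the rest of the model (e.g. via the causal and counterfactual formulations the abstract mentions) rather than from raw group means.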

Citation (APA)

Grabowicz, P., Perello, N., & Takatsu, K. (2023). Learning from Discriminatory Training Data. In AIES 2023 - Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society (pp. 752–763). Association for Computing Machinery, Inc. https://doi.org/10.1145/3600211.3604710
