Machine learning algorithms are increasingly involved in sensitive decision-making processes with adverse implications for individuals. This paper presents mdfa, an approach that identifies the characteristics of the individuals a classifier discriminates against. We measure discrimination as a violation of multi-differential fairness, a guarantee that a black-box classifier's outcomes do not leak information about the sensitive attributes of any small group of individuals. We reduce the problem of identifying worst-case violations to matching distributions across sensitive groups and predicting where sensitive attributes and classifier outcomes coincide. We apply mdfa to a recidivism risk assessment classifier and show that, among individuals with little criminal history, mdfa identifies African-Americans who are three times more likely than similar non-African-Americans to be rated at high risk of violent recidivism.
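To make the guarantee and the audit concrete, here is a rough formalization and a minimal sketch; both are illustrative assumptions, not the authors' implementation. A classifier h is ε-differentially fair on a subgroup S when, for a binary sensitive attribute a,

    exp(-ε) ≤ P(h(X) = 1 | a = 1, X ∈ S) / P(h(X) = 1 | a = 0, X ∈ S) ≤ exp(ε),

and multi-differential fairness asks this to hold for every subgroup in a rich collection. The Python sketch below (the function audit and its thresholds are hypothetical) mirrors the reduction described in the abstract: reweight the data so both sensitive groups carry equal mass, train an auditor to predict where the sensitive attribute and the classifier's outcome coincide, and flag the subgroup where that prediction deviates most from chance.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def audit(X, a, y_hat, max_depth=3, min_leaf_size=50):
    """Flag the subgroup where outcomes leak the most about attribute a.

    Sketch of an mdfa-style reduction: if some subgroup lets a learner
    predict agreement between a and y_hat better than chance, the
    classifier's outcomes leak information about a on that subgroup.
    """
    # Target: 1 where the sensitive attribute and the outcome coincide.
    agree = (a == y_hat).astype(int)
    # Crude stand-in for the paper's distribution-matching step:
    # reweight so both sensitive groups contribute equal total mass.
    p = a.mean()
    w = np.where(a == 1, 0.5 / p, 0.5 / (1 - p))
    auditor = DecisionTreeClassifier(max_depth=max_depth)
    auditor.fit(X, agree, sample_weight=w)
    # Candidate subgroups are the tree's leaves; report the leaf whose
    # agreement rate deviates most from the chance level of 0.5.
    leaves = auditor.apply(X)
    candidates = [l for l in set(leaves) if (leaves == l).sum() >= min_leaf_size]
    worst = max(candidates, key=lambda l: abs(agree[leaves == l].mean() - 0.5))
    mask = leaves == worst
    return mask, agree[mask].mean()

# Synthetic check with a planted violation: the classifier copies the
# sensitive attribute only when the first feature exceeds 1.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 4))
a = (rng.random(n) < 0.3).astype(int)
y_hat = np.where(X[:, 0] > 1, a, rng.integers(0, 2, size=n))
mask, rate = audit(X, a, y_hat)
print(f"flagged {mask.sum()} individuals, agreement rate {rate:.2f}")  # well above 0.5

A real audit would evaluate the auditor on held-out data and calibrate the deviation threshold; this sketch only illustrates the reduction from fairness auditing to a prediction task.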
Gitiaux, X., & Rangwala, H. (2019). mdfa: Multi-differential fairness auditor for black box classifiers. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19), pp. 5871–5879. https://doi.org/10.24963/ijcai.2019/814