Abstract
Ranking evaluation metrics play an important role in information retrieval, providing optimization objectives during development and a means of assessing deployed performance. Recently, the fairness of rankings has been recognized as crucial, especially as automated systems are increasingly used for high-impact decisions. While numerous fairness metrics have been proposed, a comparative analysis of their interrelationships is lacking. Even for fundamental statistical parity metrics, which measure group advantage, it remains unclear whether different metrics measure the same phenomena or when one metric may produce different results than another. To address these open questions, we formulate a conceptual framework for the analytical comparison of metrics. We prove that, under reasonable assumptions, popular metrics in the literature exhibit the same behavior, so that optimizing for one optimizes for all. However, our analysis also shows that the metrics differ in the degree of unfairness they measure, in particular when one group holds a strong majority. Based on this analysis, we design a practical statistical test to identify whether observed data are likely to exhibit predictable group bias. We conclude with a set of recommendations to guide practitioners in choosing an appropriate fairness metric.
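The statistical parity notions the paper compares can be made concrete with a small sketch. The Python snippet below is a minimal illustration, not the authors' exact metrics: it computes two common formulations of group advantage in a ranking, an exposure ratio under a log-discounted attention model and a pairwise advantage score. The function names and the toy data are assumptions introduced here for illustration.

import math

def exposure(rank):
    # Log-discounted attention at a 1-indexed rank (DCG-style discount).
    return 1.0 / math.log2(rank + 1)

def group_exposure_ratio(ranking, protected):
    # Ratio of mean exposure of protected items to non-protected items.
    # 1.0 indicates parity of exposure; values below 1.0 mean the protected
    # group receives less attention. Assumes both groups are non-empty.
    prot, nonprot = [], []
    for rank, item in enumerate(ranking, start=1):
        (prot if item in protected else nonprot).append(exposure(rank))
    return (sum(prot) / len(prot)) / (sum(nonprot) / len(nonprot))

def pairwise_advantage(ranking, protected):
    # Fraction of mixed-group pairs in which the protected item ranks higher.
    # 0.5 indicates parity; values above 0.5 favor the protected group.
    wins = total = 0
    for i, a in enumerate(ranking):
        for b in ranking[i + 1:]:
            if (a in protected) != (b in protected):
                total += 1
                wins += a in protected
    return wins / total

# Toy example: six items ranked best-first, three in the protected group.
ranking = ["a", "b", "c", "d", "e", "f"]
protected = {"c", "e", "f"}
print(f"exposure ratio:     {group_exposure_ratio(ranking, protected):.3f}")
print(f"pairwise advantage: {pairwise_advantage(ranking, protected):.3f}")

On this toy ranking both scores fall below their parity values (1.0 and 0.5 respectively), flagging a disadvantage for the protected group; the two metrics agree in direction here, which is the kind of behavioral agreement the paper analyzes formally.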
Citation
Kuhlman, C., Gerych, W., & Rundensteiner, E. (2021). Measuring Group Advantage: A Comparative Study of Fair Ranking Metrics. In AIES 2021 - Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society (pp. 674–682). Association for Computing Machinery, Inc. https://doi.org/10.1145/3461702.3462588