Bad seeds: Evaluating lexical methods for bias measurement

Maria Antoniak; David Mimno

Conference Proceedings, Open Access

ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference (2021) 1889-1904

DOI: 10.18653/v1/2021.acl-long.148

50 Citations

81 Readers

Abstract

A common factor in bias measurement methods is the use of hand-curated seed lexicons, but there remains little guidance for their selection. We gather seeds used in prior work, documenting their common sources and rationales, and in case studies of three English-language corpora, we enumerate the different types of social biases and linguistic features that, once encoded in the seeds, can affect subsequent bias measurements. Seeds developed in one context are often re-used in other contexts, but documentation and evaluation remain necessary precursors to relying on seeds for sensitive measurements.

Cite

Antoniak, M., & Mimno, D. (2021). Bad seeds: Evaluating lexical methods for bias measurement. In ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference (pp. 1889–1904). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.acl-long.148
