Large language models (LLMs) trained on real-world data can inadvertently reflect harmful societal biases, particularly toward historically marginalized communities. While previous work has primarily focused on harms related to age and race, emerging research has shown that biases toward disabled communities exist. This study extends prior work exploring the existence of harms by identifying categories of LLM-perpetuated harms toward the disability community. We conducted 19 focus groups, during which 56 participants with disabilities probed a dialog model about disability and discussed and annotated its responses. Participants rarely characterized model outputs as blatantly offensive or toxic. Instead, participants used nuanced language to detail how the dialog model mirrored subtle yet harmful stereotypes they encountered in their lives and dominant media, e.g., inspiration porn and able-bodied saviors. Participants often implicated training data as a cause for these stereotypes and recommended training the model on diverse identities from disability-positive resources. Our discussion further explores representative data strategies to mitigate harm related to different communities through annotation co-design with ML researchers and developers.
Gadiraju, V., Kane, S., Dev, S., Taylor, A., Wang, D., Denton, E., & Brewer, R. (2023). “I wouldn’t say offensive but...”: Disability-Centered Perspectives on Large Language Models. In ACM International Conference Proceeding Series (pp. 205–216). Association for Computing Machinery. https://doi.org/10.1145/3593013.3593989