Abstract
One widely cited barrier to the adoption of LLMs as proxies for humans in subjective tasks is their sensitivity to prompt wording, but interestingly, humans also display sensitivities to instruction changes in the form of response biases. We investigate the extent to which LLMs reflect human response biases, if at all. We look to survey design, where human response biases caused by changes in the wordings of "prompts" have been extensively explored in social psychology literature. Drawing from these works, we design a dataset and framework to evaluate whether LLMs exhibit human-like response biases in survey questionnaires. Our comprehensive evaluation of nine models shows that popular open and commercial LLMs generally fail to reflect human-like behavior, particularly models that have undergone RLHF. Furthermore, even if a model shows a significant change in the same direction as humans, we find that it is sensitive to perturbations that do not elicit significant changes in humans. These results highlight the pitfalls of using LLMs as human proxies, and underscore the need for finer-grained characterizations of model behavior.
Cite
Tjuatja, L., Chen, V., Wu, T., Talwalkar, A., & Neubig, G. (2024). Do LLMs Exhibit Human-like Response Biases? A Case Study in Survey Design. Transactions of the Association for Computational Linguistics, 12, 1011–1026. https://doi.org/10.1162/tacl_a_00685