Credible Without Credit: Domain Experts Assess Generative Language Models

27 citations · 23 Mendeley readers

Abstract

Language models have recently broken into the public consciousness with the release of the wildly popular ChatGPT. Commentators have argued that language models could replace search engines, make college essays obsolete, or even write academic research papers. All of these tasks rely on the accuracy of specialized information, which can be difficult for non-experts to assess. Drawing on 10 domain experts across science and culture, we provide an initial assessment of the coherence, conciseness, accuracy, and sourcing of two language models across 100 expert-written questions. While we find the responses consistently coherent and concise, their accuracy is mixed. These results raise questions about the role language models should play in general-purpose and expert knowledge seeking.

Citation (APA)

Peskoff, D., & Stewart, B. M. (2023). Credible Without Credit: Domain Experts Assess Generative Language Models. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 2, pp. 427–438). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.acl-short.37
