Towards Automatic Evaluation of NLG Tasks Using Conversational Large Language Models

Abstract

Evaluating the quality of machine-generated open-ended texts is a long-standing challenge in Natural Language Processing (NLP). Despite dramatic advances in the machine learning technologies driving research in Natural Language Generation (NLG), the subfield of NLP that focuses on text generation, no automatic evaluation technique for NLG tasks has yet proven both reliable and widely adopted. In this paper, we propose leveraging conversational Large Language Models (LLMs) as automatic evaluators for several open-ended NLG tasks. Our experiments with a recently released conversational LLM, ChatGPT, demonstrate the viability of our proposal.
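The abstract's proposal, prompting a conversational LLM to act as a judge of generated text, can be sketched as below. The prompt wording, the 1–5 rating scale, and the `parse_score` helper are illustrative assumptions for this sketch, not the authors' exact setup; the assembled prompt would be sent to the LLM's chat interface, and only the reply parsing is shown here.

```python
import re


def build_eval_prompt(task: str, source: str, candidate: str) -> str:
    """Assemble an instruction asking a conversational LLM to rate a
    candidate text on a 1-5 scale. Wording is illustrative, not the
    authors' exact prompt."""
    return (
        f"You are evaluating a {task} system.\n"
        f"Source text:\n{source}\n\n"
        f"Generated text:\n{candidate}\n\n"
        "On a scale of 1 (poor) to 5 (excellent), rate the generated "
        "text for fluency and faithfulness. Reply with the number first."
    )


def parse_score(reply: str) -> float:
    """Extract the first number from the LLM's free-form reply."""
    match = re.search(r"\d+(?:\.\d+)?", reply)
    if match is None:
        raise ValueError(f"no score found in reply: {reply!r}")
    return float(match.group())


prompt = build_eval_prompt("summarization", "Long article ...", "Short summary ...")
# The prompt would be sent to the LLM; here we only demonstrate
# parsing a plausible free-form reply into a numeric score.
score = parse_score("4 - the summary is fluent but omits one detail.")
```

One design consideration such a pipeline must handle is that conversational LLMs answer in free-form text, so the score has to be extracted robustly from the reply rather than assumed to appear in a fixed position.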


CITATION STYLE

APA

Riyadh, M., & Shafiq, M. O. (2023). Towards Automatic Evaluation of NLG Tasks Using Conversational Large Language Models. In IFIP Advances in Information and Communication Technology (Vol. 676 IFIP, pp. 425–437). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-34107-6_34
