Evaluating the quality of machine-generated open-ended text is a long-standing challenge in Natural Language Processing (NLP). Despite dramatic advances in the machine learning techniques driving Natural Language Generation (NLG), the subfield of NLP concerned with text generation, no promising and widely adopted automatic evaluation technique for NLG tasks has yet emerged. In this paper, we propose leveraging conversational Large Language Models (LLMs) as automatic evaluators for several open-ended NLG tasks. Our experiments with a recently released conversational LLM, ChatGPT, demonstrate the viability of our proposal.
Citation:
Riyadh, M., & Shafiq, M. O. (2023). Towards Automatic Evaluation of NLG Tasks Using Conversational Large Language Models. In IFIP Advances in Information and Communication Technology (Vol. 676 IFIP, pp. 425–437). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-34107-6_34