Evaluating AI-Generated Emails: A Comparative Efficiency Analysis

Abstract

This study investigates the efficiency of large language models (LLMs) in producing routine, negative, and persuasive business emails for educational purposes in the context of Business Writing. Specifically, it compares the outputs of four widely used LLMs (ChatGPT 3.5, Llama 2, Bing Chat, and Bard) when presented with identical email scenarios. The generated emails are evaluated with an elaborate rubric, allowing a systematic assessment of the LLMs' performance across the three email types. The results show that the output for the same prompt varies greatly, despite the rather formulaic nature of business emails. For instance, some LLMs struggle to follow the requested structure and to maintain a consistent tone, while others have problems with unity and conciseness. The findings have implications for teaching business writing (rubrics, task instructions, in-class implementation) and for the integration of AI into professional communication at large.

Jovic, M., & Mnasri, S. (2024). Evaluating AI-Generated Emails: A Comparative Efficiency Analysis. World Journal of English Language, 14(2), 502–517. https://doi.org/10.5430/wjel.v14n2p502
