Detecting Artificially Generated Academic Text: The Importance of Mimicking Human Utilization of Large Language Models

Abstract

The advent of Large Language Models (LLMs) has led to a surge in Natural Language Generation (NLG), aiding humans in composing text for various tasks. However, these models risk being misused: in academia, for instance, distinguishing artificially generated text from original human-written text is a growing concern. Current detection research does not attempt to replicate how humans actually use these models. In our work, we address this gap by leveraging data generated by mimicking how humans would use LLMs when composing academic works. Our study examines the detectability of the generated text using DetectGPT and GLTR, and we employ state-of-the-art classification models such as SciBERT, RoBERTa, DeBERTa, XLNet, and ELECTRA. Our experiments show that the generated text is difficult to detect with existing models when it is produced by an LLM fine-tuned on the remainder of a paper. This highlights the importance of using realistic and challenging datasets in future research aimed at detecting artificially generated text.
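To make the classification setup concrete, the sketch below shows one plausible way to fine-tune a pretrained transformer (here RoBERTa, though SciBERT or DeBERTa would slot in the same way) as a binary human-vs-generated detector using the Hugging Face transformers library. This is an illustrative reconstruction, not the authors' released code: the label convention, hyperparameters, and function names are assumptions.

```python
# Minimal sketch (assumed setup, not the paper's actual code):
# fine-tune a pretrained sequence classifier to label text as
# human-written (0) or LLM-generated (1).

import torch
from torch.utils.data import Dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)


class DetectionDataset(Dataset):
    """Wraps parallel lists of texts and binary labels (0 = human, 1 = generated)."""

    def __init__(self, texts, labels, tokenizer, max_length=512):
        self.encodings = tokenizer(
            texts, truncation=True, padding="max_length", max_length=max_length
        )
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item


def train_detector(train_texts, train_labels, model_name="roberta-base"):
    # Any encoder from the paper's list (SciBERT, DeBERTa, XLNet, ELECTRA)
    # could be substituted via its Hugging Face model name.
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name, num_labels=2
    )
    train_ds = DetectionDataset(train_texts, train_labels, tokenizer)
    args = TrainingArguments(
        output_dir="detector",          # checkpoint directory (illustrative)
        num_train_epochs=3,             # hyperparameters are assumptions
        per_device_train_batch_size=8,
        learning_rate=2e-5,
    )
    Trainer(model=model, args=args, train_dataset=train_ds).train()
    return model, tokenizer
```

The same fine-tuned model can then score held-out passages; the paper's finding is that when the generator itself was fine-tuned on the rest of the target paper, such detectors (as well as zero-shot methods like DetectGPT and GLTR's token-rank statistics) perform markedly worse.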

Citation (APA)

Liyanage, V., & Buscaldi, D. (2023). Detecting Artificially Generated Academic Text: The Importance of Mimicking Human Utilization of Large Language Models. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13913 LNCS, pp. 558–565). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-35320-8_42
