The advent of Large Language Models (LLMs) has led to a surge in Natural Language Generation (NLG), aiding humans in composing text for various tasks. However, these models risk being misused. For instance, distinguishing artificially generated text from original text is a concern in academia. Current research on detection does not attempt to replicate how humans would actually use these models. In our work, we address this issue by leveraging data generated by mimicking how humans would use LLMs when composing academic works. Our study examines the detectability of the generated text using DetectGPT and GLTR, and we employ state-of-the-art classification models such as SciBERT, RoBERTa, DeBERTa, XLNet, and ELECTRA. Our experiments show that the generated text is difficult to detect with existing models when it is created by an LLM fine-tuned on the remainder of a paper. This highlights the importance of using realistic and challenging datasets in future research aimed at detecting artificially generated text.
CITATION STYLE
Liyanage, V., & Buscaldi, D. (2023). Detecting Artificially Generated Academic Text: The Importance of Mimicking Human Utilization of Large Language Models. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13913 LNCS, pp. 558–565). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-35320-8_42