As generative NLP models can now produce content nearly indistinguishable from human writing, it is becoming difficult to identify genuine research contributions in academic and scientific publications. Moreover, information in machine-generated text can be factually wrong or even entirely fabricated. In this work, we introduce a novel benchmark dataset of human-written scientific papers and machine-generated papers produced by SCIgen, GPT-2, GPT-3, ChatGPT, and Galactica, as well as papers co-created by humans and ChatGPT. We also experiment with several types of classifiers, both linguistic-based and transformer-based, for detecting the authorship of scientific text. We place particular emphasis on generalization and explainability to highlight the strengths and weaknesses of these detectors. Our work takes an important step toward more robust methods for distinguishing between human-written and machine-generated scientific papers, ultimately helping to ensure the integrity of the scientific literature.
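For readers unfamiliar with the setup, a linguistic-feature authorship detector of the kind compared in the paper can be sketched as a stylometric pipeline. The following is a minimal illustration only, assuming a generic TF-IDF character n-gram representation with logistic regression; the tiny in-line corpus and labels are placeholders, not the benchmark data, and this is not the authors' exact implementation.

```python
# Minimal sketch of a stylometric baseline for human vs. machine authorship
# detection. The two example documents below are illustrative placeholders;
# the real benchmark pairs human-written papers with output from SCIgen,
# GPT-2, GPT-3, ChatGPT, and Galactica.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "We propose a novel method for extracting citation graphs ...",  # human (placeholder)
    "In this paper, we present a paper that presents the method.",   # machine (placeholder)
]
labels = ["human", "machine"]

# Character n-grams capture surface stylometric cues (punctuation habits,
# function-word patterns) that word-level features can miss.
clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
clf.fit(texts, labels)

print(clf.predict(["Our experiments demonstrate state-of-the-art results."]))
```

A transformer-based detector would instead fine-tune a pretrained encoder on the same labeled corpus; the linear pipeline above trades accuracy for inspectable feature weights, which is one reason the paper contrasts the two families when discussing explainability.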
Citation: Abdalla, M. H. I., Malberg, S., Dementieva, D., Mosca, E., & Groh, G. (2023). A Benchmark Dataset to Distinguish Human-Written and Machine-Generated Scientific Papers. Information, 14(10), 522. https://doi.org/10.3390/info14100522