Beyond Black Box AI-Generated Plagiarism Detection: From Sentence to Document Level

Abstract

The increasing reliance on large language models (LLMs) in academic writing has led to a rise in plagiarism. Existing AI-generated text classifiers have limited accuracy and often produce false positives. We propose a novel approach using natural language processing (NLP) techniques, offering quantifiable metrics at both sentence and document levels for easier interpretation by human evaluators. Our method employs a multi-faceted approach, generating multiple paraphrased versions of a given question and inputting them into the LLM to generate answers. By using a contrastive loss function based on cosine similarity, we match generated sentences with those from the student’s response. Our approach achieves up to 94% accuracy in classifying human and AI text, providing a robust and adaptable solution for plagiarism detection in academic settings. This method improves with LLM advancements, reducing the need for new model training or reconfiguration, and offers a more transparent way of evaluating and detecting AI-generated text.
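The sentence-to-document scoring described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract describes a contrastive loss based on cosine similarity; this sketch swaps in plain term-count vectors so that it runs with the standard library alone. The function names, the naive sentence splitter, and the 0.8 threshold are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter on ., !, ? (an assumption, not the paper's tokenizer)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def bow_vector(sentence):
    """Term-count vector; a stand-in for a learned sentence embedding."""
    return Counter(re.findall(r"[a-z0-9']+", sentence.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def sentence_scores(student_text, generated_answers):
    """Pair each student sentence with its best similarity against
    every sentence in the LLM-generated answers."""
    generated = [
        bow_vector(s)
        for answer in generated_answers
        for s in split_sentences(answer)
    ]
    scores = []
    for sent in split_sentences(student_text):
        vec = bow_vector(sent)
        best = max((cosine(vec, g) for g in generated), default=0.0)
        scores.append((sent, best))
    return scores


def document_score(scores, threshold=0.8):
    """Fraction of student sentences at or above the threshold.
    The default threshold is a hypothetical value, not from the paper."""
    if not scores:
        return 0.0
    return sum(1 for _, s in scores if s >= threshold) / len(scores)


# Hypothetical inputs: one generated answer and a two-sentence student response.
generated = [
    "Photosynthesis converts light energy into chemical energy. "
    "It occurs in chloroplasts."
]
student = (
    "Photosynthesis converts light energy into chemical energy. "
    "My grandmother grows tomatoes."
)
scores = sentence_scores(student, generated)
doc = document_score(scores)
```

The per-sentence scores give a human evaluator something concrete to inspect, and the document-level fraction summarizes them. In the approach the abstract outlines, `generated_answers` would hold the LLM's answers to several paraphrased versions of the question.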

Cite

Quidwai, M. A., Li, C., & Dube, P. (2023). Beyond Black Box AI-Generated Plagiarism Detection: From Sentence to Document Level. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 727–735). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.bea-1.58
