Plagiarism is a common problem faced by academia and education. Mature commercial plagiarism detection system has the advantages of comprehensive and high accuracy, but the expensive detection costs make it unsuitable for real-time, lightweight application environment such as the student assignments plagiarism detection. This paper introduces the method of extending classic Winnowing plagiarism detection algorithm, expands the algorithm in functionality. The extended algorithm can retain the text location and length information in original document while extracting the fingerprints of a document, so that the locating and marking for plagiarism text fragment are much easier to achieve. The experimental results and several years of running practice show that the expansion of the algorithm has little effect on its performance, normal hardware configuration of PC will be able to meet small and medium-sized applications requirements. Based on the characteristics of lightweight, high efficiency, reliability and flexibility of Winnowing, the extended algorithm further enhances the adaptability and extends the application areas.
Duan, X., Wang, M., & Mu, J. (2017). A Plagiarism Detection Algorithm based on Extended Winnowing. MATEC Web of Conferences, 128, 02019. https://doi.org/10.1051/matecconf/201712802019