Detection of Text Similarity for Indication Plagiarism Using Winnowing Algorithm Based K-gram and Jaccard Coefficient



Abstract

Documents are a common form of digital data, and they can be easily copied and deleted: anyone can retype or copy parts of a document. This paper detects text similarity; the more words two documents share, the stronger the indication of plagiarism. The winnowing algorithm computes a hash value for each k-gram, which improves search time and gives more accuracy in the detection process. The hash values it selects form the fingerprint of a document, and fingerprints are the basis for comparing text data. The fingerprints produced by winnowing for each document are matched using the Jaccard coefficient to measure text similarity. The results show that adjusting the k-gram and window values affects the final similarity percentage: the smaller the k-gram value, the greater the percentage.
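The pipeline described in the abstract (k-gram hashing, window-based fingerprint selection, Jaccard matching) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the preprocessing step (lowercasing and dropping non-alphanumeric characters), and the use of MD5 as the k-gram hash are all assumptions, since the abstract does not specify them.

```python
import hashlib


def normalize(text):
    # Assumed preprocessing: lowercase and keep only letters and digits.
    return "".join(ch.lower() for ch in text if ch.isalnum())


def kgram_hashes(text, k):
    # Hash every substring of length k. MD5 is an assumption here;
    # any deterministic hash (e.g. a rolling hash) would serve.
    return [
        int(hashlib.md5(text[i:i + k].encode("utf-8")).hexdigest(), 16)
        for i in range(len(text) - k + 1)
    ]


def winnow(hashes, w):
    # Slide a window of w hashes and keep the minimum of each window.
    # The resulting set of selected hashes is the document fingerprint.
    if not hashes:
        return set()
    if len(hashes) <= w:
        return {min(hashes)}
    return {min(hashes[i:i + w]) for i in range(len(hashes) - w + 1)}


def jaccard(a, b):
    # Jaccard coefficient: |A intersect B| / |A union B|.
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_percent(text_a, text_b, k, w):
    # Fingerprint both texts, then compare the fingerprints.
    fp_a = winnow(kgram_hashes(normalize(text_a), k), w)
    fp_b = winnow(kgram_hashes(normalize(text_b), k), w)
    return 100.0 * jaccard(fp_a, fp_b)


if __name__ == "__main__":
    doc_a = "Documents can be easily copied and deleted."
    doc_b = "Documents can be copied and deleted very easily."
    for k in (3, 5, 7):
        print(k, round(similarity_percent(doc_a, doc_b, k=k, w=4), 2))
```

Running the script with several values of k and w is one way to observe how those two parameters change the reported percentage for a given pair of texts.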


Puspaningrum, E. Y., Nugroho, B., Setiawan, A., & Hariyanti, N. (2020). Detection of Text Similarity for Indication Plagiarism Using Winnowing Algorithm Based K-gram and Jaccard Coefficient. In Journal of Physics: Conference Series (Vol. 1569). IOP Publishing Ltd. https://doi.org/10.1088/1742-6596/1569/2/022044
