Korean documents copy detection based on Ferret

Byung Ryul Ahn; Won Gyum Kim; Won Young Yu; Moon Hyun Kim

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2011), vol. 6838 LNCS, pp. 440-447

DOI: 10.1007/978-3-642-24728-6_60

Abstract

With the spread of electronic documents, plagiarism is increasing rapidly, and because manual detection is difficult, a need has emerged for plagiarism detection systems that help protect intellectual property. Many content-based detection systems have been developed and are in use in other countries, but they remain inadequate for documents written in Korean. In particular, the high variance of Hangul makes detection systems harder to develop. This study proposes a detection method for Hangul documents based on Ferret's trigrams. Ferret considers only the frequency of trigram matches when measuring similarity; the system in this study extends it by weighting results according to the degree of trigram match, thereby improving the accuracy of similarity detection. © 2011 Springer-Verlag.
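The idea in the abstract can be illustrated with a minimal Python sketch. It assumes the commonly described form of Ferret's measure, in which each document is reduced to its set of overlapping word trigrams and resemblance is the number of shared trigrams divided by the number of distinct trigrams in either document. The abstract does not give the paper's weighting scheme, so `weighted_resemblance` and its `partial_weight` parameter are hypothetical: they simply give partial credit to trigrams that agree in two of three positions. Whitespace tokenization is also an assumption and may not match how the authors segment Korean text.

```python
def word_trigrams(text):
    """Return the set of overlapping word trigrams in a text."""
    words = text.split()  # assumption: whitespace-separated tokens
    return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}


def resemblance(doc_a, doc_b):
    """Ferret-style resemblance: shared trigrams / all distinct trigrams."""
    a, b = word_trigrams(doc_a), word_trigrams(doc_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def weighted_resemblance(doc_a, doc_b, partial_weight=0.5):
    """Hypothetical variant: also credit trigrams matching in 2 of 3 positions.

    This is an illustrative stand-in, not the weighting used in the paper.
    """
    a, b = word_trigrams(doc_a), word_trigrams(doc_b)
    union = a | b
    if not union:
        return 0.0
    exact = a & b
    rest_a, rest_b = a - exact, b - exact
    # Count unmatched trigrams of A that agree with some unmatched
    # trigram of B in exactly two word positions.
    partial = sum(
        1
        for t in rest_a
        if any(sum(x == y for x, y in zip(t, u)) == 2 for u in rest_b)
    )
    return (len(exact) + partial_weight * partial) / len(union)


if __name__ == "__main__":
    original = "the quick brown fox jumps"
    edited = "the quick brown dog jumps"
    print(resemblance(original, edited))
    print(weighted_resemblance(original, edited))
```

In this sketch a single substituted word lowers the plain resemblance sharply because every trigram containing that word stops matching, while the weighted variant still registers the near matches. That is the kind of gap a degree-of-match weighting is meant to close.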

Author supplied keywords

Intelligent Computing in Pattern Recognition

Cite

APA

Ahn, B. R., Kim, W. G., Yu, W. Y., & Kim, M. H. (2011). Korean documents copy detection based on Ferret. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6838 LNCS, pp. 440–447). https://doi.org/10.1007/978-3-642-24728-6_60
