Multilingual plagiarism detection

Zdenek Ceska; Michal Toman; Karel Jezek

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2008) 5253 LNAI 83-92

DOI: 10.1007/978-3-540-85776-1_8

50 Citations

44 Readers

Abstract

Multilingual text processing has attracted growing attention in recent years, a trend accentuated by the global integration of European states and the vanishing of cultural and social boundaries. It has become an important field that raises many new and interesting problems. This paper describes a novel approach to multilingual plagiarism detection. We propose a new method, called MLPlag, for plagiarism detection in a multilingual environment. The method is based on the analysis of word positions. It uses the EuroWordNet thesaurus, which transforms words into a language-independent form. This makes it possible to identify documents plagiarized from sources written in other languages. Special techniques, such as semantic-based word normalization, were incorporated to refine the method. This lets it identify the synonym replacements that plagiarists use to hide a document match. We ran and evaluated our experiments on monolingual and multilingual corpora, and the results are presented in this paper. © Springer-Verlag Berlin Heidelberg 2008.
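
The idea in the abstract, mapping words to a language-independent form and then comparing documents by word positions, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the MLPlag implementation: `TOY_INDEX` is a hypothetical hand-written stand-in for EuroWordNet, the function names are invented, and the Jaccard overlap of concept bigrams is a simple substitute for whatever position-based scoring the paper actually uses.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy index standing in for EuroWordNet:
# (language, word) -> language-independent concept id.
# Synonyms ("dog", "hound") and translations ("pes") share one id.
TOY_INDEX: Dict[Tuple[str, str], str] = {
    ("en", "dog"): "c_dog",
    ("en", "hound"): "c_dog",
    ("cs", "pes"): "c_dog",
    ("en", "eats"): "c_eat",
    ("cs", "jí"): "c_eat",
    ("en", "meat"): "c_meat",
    ("cs", "maso"): "c_meat",
    ("en", "cat"): "c_cat",
    ("en", "drinks"): "c_drink",
    ("en", "milk"): "c_milk",
}


def to_concepts(text: str, lang: str) -> List[str]:
    """Map each known word to its concept id, preserving word order.

    Words missing from the toy index are skipped.
    """
    concepts = []
    for word in text.lower().split():
        concept = TOY_INDEX.get((lang, word))
        if concept is not None:
            concepts.append(concept)
    return concepts


def concept_ngrams(concepts: List[str], n: int = 2) -> Set[Tuple[str, ...]]:
    """Collect the set of n-grams over consecutive concept positions."""
    return {tuple(concepts[i:i + n]) for i in range(len(concepts) - n + 1)}


def similarity(a: str, lang_a: str, b: str, lang_b: str, n: int = 2) -> float:
    """Jaccard overlap of concept n-grams (an illustrative measure only)."""
    grams_a = concept_ngrams(to_concepts(a, lang_a), n)
    grams_b = concept_ngrams(to_concepts(b, lang_b), n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)
```

In this sketch, `similarity("hound eats meat", "en", "pes jí maso", "cs")` scores a full match because the synonym and the Czech translation both map to the same concept ids, while an unrelated sentence such as "cat drinks milk" shares no concept bigrams with the Czech text.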

Cite

Ceska, Z., Toman, M., & Jezek, K. (2008). Multilingual plagiarism detection. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5253 LNAI, pp. 83–92). https://doi.org/10.1007/978-3-540-85776-1_8
