HYPLAG: Hybrid Arabic text plagiarism detection system

Bilal Ghanem; Labib Arafeh; Paolo Rosso; Fernando Sánchez-Vega

Conference Proceedings


Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018) 10859 LNCS 315-323

DOI: 10.1007/978-3-319-91947-8_33



Abstract

Plagiarism is defined as the literary theft of paragraphs or sentences from an unreferenced source. This unauthorized behavior is a real problem in scientific research. This paper proposes a Hybrid Arabic Plagiarism Detection System (HYPLAG). The HYPLAG approach combines corpus-based and knowledge-based approaches by utilizing an Arabic semantic resource (Arabic WordNet). A preliminary study on texts from undergraduate students was conducted to understand their behavior and the patterns they use when plagiarizing. The results of the study show that students apply different techniques to plagiarized sentences and that they change sentence components (verbs, nouns, and adjectives). HYPLAG was evaluated on the ExAraPlagDet-2015 dataset against several other approaches that participated in the AraPlagDet PAN@FIRE shared task on extrinsic Arabic plagiarism detection, obtaining higher performance (F-score of 89% vs. 84% for the best-performing system at AraPlagDet) with less computational time.
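The abstract does not describe how the corpus-based and knowledge-based signals are combined, so the following is only a minimal sketch of the general idea, not the HYPLAG method. It assumes a bag-of-words cosine score as the lexical component and a tiny hand-written synonym table (the hypothetical `SYNSETS`, with English words for readability) standing in for a semantic resource such as Arabic WordNet. The names `canonical`, `cosine`, `hybrid_similarity`, and the `weight` parameter are all illustrative assumptions.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy synonym groups standing in for a semantic resource such as
# Arabic WordNet. English words are used only to keep the example readable.
SYNSETS = [
    {"big", "large", "huge"},
    {"study", "research"},
    {"shows", "demonstrates"},
]


def canonical(token):
    """Map a token to one fixed representative of its synonym group, if any."""
    for group in SYNSETS:
        if token in group:
            return min(group)  # alphabetically first member as the representative
    return token


def cosine(tokens_a, tokens_b):
    """Cosine similarity between two bags of tokens."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a)
    norm_a = sqrt(sum(v * v for v in counts_a.values()))
    norm_b = sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_similarity(sentence_a, sentence_b, weight=0.5):
    """Weighted mix of a surface-form score and a synonym-normalized score.

    The linear combination and the default weight are assumptions made for
    illustration; they are not taken from the paper.
    """
    tokens_a = sentence_a.lower().split()
    tokens_b = sentence_b.lower().split()
    lexical = cosine(tokens_a, tokens_b)
    semantic = cosine(
        [canonical(t) for t in tokens_a],
        [canonical(t) for t in tokens_b],
    )
    return weight * lexical + (1 - weight) * semantic


if __name__ == "__main__":
    original = "the study shows large gains"
    paraphrase = "the research demonstrates big gains"
    print(hybrid_similarity(original, paraphrase))
```

In this sketch, a sentence whose verbs, nouns, and adjectives were swapped for synonyms scores higher under the hybrid measure than under surface overlap alone, which is the kind of obfuscation the preliminary study reports.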


Ghanem, B., Arafeh, L., Rosso, P., & Sánchez-Vega, F. (2018). HYPLAG: Hybrid Arabic text plagiarism detection system. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10859 LNCS, pp. 315–323). Springer Verlag. https://doi.org/10.1007/978-3-319-91947-8_33
