HYPLAG: Hybrid Arabic text plagiarism detection system

Abstract

Plagiarism is defined as the literary theft of paragraphs or sentences from an unreferenced source. This unauthorized behavior is a real problem in scientific research. This paper proposes a Hybrid Arabic Plagiarism Detection System (HYPLAG). HYPLAG combines corpus-based and knowledge-based approaches by utilizing an Arabic semantic resource (Arabic WordNet). A preliminary study on texts from undergraduate students was conducted to understand their behavior and the patterns they use when plagiarizing. The results of the study show that students apply different techniques to plagiarized sentences and that they change sentence components (verbs, nouns, and adjectives). HYPLAG was evaluated on the ExAraPlagDet-2015 dataset against several other approaches that participated in the AraPlagDet PAN@FIRE shared task on extrinsic Arabic plagiarism detection, obtaining higher performance (F-score of 89% vs. 84% for the best-performing system at AraPlagDet) with less computational time.
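
To illustrate the general idea of combining a corpus-based signal with a knowledge-based one, here is a minimal Python sketch. It is not the HYPLAG implementation, which the abstract does not describe in detail. It assumes a bag-of-words cosine score as the corpus-based part and a synonym lookup as the knowledge-based part; the names `TOY_SYNONYMS`, `synonym_overlap`, `hybrid_similarity`, and the `weight` parameter are all hypothetical, and the tiny English synonym table merely stands in for a lexical resource such as Arabic WordNet.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy synonym table; a real system would query a lexical
# resource such as Arabic WordNet instead of this hand-written dictionary.
TOY_SYNONYMS = {
    "big": {"large"},
    "large": {"big"},
    "car": {"automobile"},
    "automobile": {"car"},
}


def cosine_similarity(a_tokens, b_tokens):
    """Corpus-based signal: cosine similarity of term-frequency vectors."""
    a, b = Counter(a_tokens), Counter(b_tokens)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def synonym_overlap(a_tokens, b_tokens, synonyms):
    """Knowledge-based signal: share of tokens in `a_tokens` that occur in
    `b_tokens` either literally or through a listed synonym."""
    if not a_tokens:
        return 0.0
    b_set = set(b_tokens)
    matched = sum(
        1 for t in a_tokens if t in b_set or synonyms.get(t, set()) & b_set
    )
    return matched / len(a_tokens)


def hybrid_similarity(a, b, synonyms=TOY_SYNONYMS, weight=0.5):
    """Weighted mix of the two signals; the 0.5 default is an arbitrary assumption."""
    a_tokens, b_tokens = a.lower().split(), b.lower().split()
    corpus_score = cosine_similarity(a_tokens, b_tokens)
    knowledge_score = synonym_overlap(a_tokens, b_tokens, synonyms)
    return weight * corpus_score + (1 - weight) * knowledge_score


if __name__ == "__main__":
    # A paraphrase that swaps words for synonyms scores low on the purely
    # lexical signal but high on the synonym-aware one.
    print(hybrid_similarity("the big car", "the large automobile"))
```

In this sketch a sentence rewritten with synonyms still receives a high combined score, which is the kind of word-level substitution the preliminary study reports students making.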

Cite

Ghanem, B., Arafeh, L., Rosso, P., & Sánchez-Vega, F. (2018). HYPLAG: Hybrid Arabic text plagiarism detection system. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10859 LNCS, pp. 315–323). Springer Verlag. https://doi.org/10.1007/978-3-319-91947-8_33
