Text similarity based on data compression in Arabic

Abstract

With the huge amount of written data available online and offline, plagiarism detection has become a pressing need in many fields of science and knowledge. Various context-based plagiarism detection methods have been published in the literature. This paper attempts to develop a new plagiarism detection method using text similarity for Arabic-language text, evaluated on 150 documents and 330 paragraphs (159 from the source document and 171 from the Al-Khaleej corpus). The findings of the study show that similarity measurement based on Lempel-Ziv comparison algorithms is very efficient at detecting the plagiarized parts of Arabic text documents, with a success rate of 71.42%. Future studies can improve the efficiency of the algorithms by combining more sophisticated computational, statistical, and linguistic hybrid detection methods. © Springer-Verlag Berlin Heidelberg 2014.
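
The abstract does not spell out how the Lempel-Ziv similarity is computed, so the following is only a minimal sketch of one common way to build such a measure, not the authors' implementation. It assumes an LZ78-style parse of each text into a set of phrases and scores two texts by the share of phrases they have in common. The function names `lz78_phrases` and `lz_similarity` and the normalization by the smaller dictionary are illustrative assumptions.

```python
def lz78_phrases(text: str) -> set[str]:
    """Parse text LZ78-style and return the set of distinct phrases.

    Each phrase is the shortest prefix of the remaining input that has
    not been seen before (assumed scheme; the paper may differ).
    """
    phrases: set[str] = set()
    current = ""
    for ch in text:
        current += ch
        if current not in phrases:
            phrases.add(current)
            current = ""
    # A non-empty leftover is already in the set, so nothing more to add.
    return phrases


def lz_similarity(a: str, b: str) -> float:
    """Share of LZ78 phrases common to both texts, in the range [0, 1].

    Normalizing by the smaller dictionary is an assumption made for
    this sketch.
    """
    pa, pb = lz78_phrases(a), lz78_phrases(b)
    if not pa or not pb:
        return 0.0
    return len(pa & pb) / min(len(pa), len(pb))


if __name__ == "__main__":
    source = "السلام عليكم ورحمة الله وبركاته"
    suspect = "السلام عليكم ورحمة الله"
    print(lz_similarity(source, suspect))
```

Because Python strings are sequences of Unicode code points, the parse works on Arabic characters directly; a real system would likely add preprocessing such as diacritic removal and a decision threshold, neither of which is specified in the abstract.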

Cite

Soori, H., Prilepok, M., Platos, J., Berhan, E., & Snasel, V. (2014). Text similarity based on data compression in Arabic. In Lecture Notes in Electrical Engineering (Vol. 282 LNEE, pp. 211–220). Springer Verlag. https://doi.org/10.1007/978-3-642-41968-3_22
