An Approach for Plagiarism Detection in Learning Resources

Tran Thanh Dien; Huynh Ngoc Han; Nguyen Thai-Nghe

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2019) 11814 LNCS 722-730

DOI: 10.1007/978-3-030-35653-8_52

2 Citations

3 Readers

Abstract

Plagiarism detection has drawn attention from both individuals and organizations. It can be used to detect copying in documents such as publications, books, and theses. Many approaches have been proposed for plagiarism detection, and they work well for English. However, different countries use different languages, so language-specific processing (e.g., handling diacritics such as the acute and circumflex accents) as well as the semantics and order of words remain challenging. This work proposes an approach for plagiarism detection, especially for Vietnamese documents in learning and research resources. The input documents are pre-processed, and their features are extracted and vectorized using TF-IDF. Then, the cosine similarity and the word-order similarity of the documents are computed. Finally, these similarities are combined into an ensemble score. Experimental results on a Vietnamese journal dataset show that the proposed approach is feasible.
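
The pipeline described in the abstract (TF-IDF vectorization, cosine similarity, word-order similarity, and a combination of the two) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the whitespace tokenizer, the smoothed IDF, the position-vector formula for word-order similarity, the function names, and the weight `alpha` are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split. Real Vietnamese text would
    # need a proper word segmenter, which the abstract does not name.
    return text.lower().split()


def tfidf_vectors(docs):
    # docs: list of token lists. Returns one {term: weight} dict per document.
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Assumption: smoothed IDF, so terms found in every document keep weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (cnt / len(doc)) * idf[t] for t, cnt in tf.items()})
    return vectors


def cosine_similarity(u, v):
    # Cosine of the angle between two sparse vectors stored as dicts.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def word_order_similarity(a, b):
    # Assumption: each token list becomes a vector of 1-based first positions
    # over the joint vocabulary (0 if absent); the score is
    # 1 - ||r1 - r2|| / ||r1 + r2||.
    vocab = list(dict.fromkeys(a + b))

    def positions(tokens):
        first = {}
        for i, t in enumerate(tokens, start=1):
            first.setdefault(t, i)
        return [first.get(t, 0) for t in vocab]

    r1, r2 = positions(a), positions(b)
    diff = math.sqrt(sum((x - y) ** 2 for x, y in zip(r1, r2)))
    total = math.sqrt(sum((x + y) ** 2 for x, y in zip(r1, r2)))
    return 1.0 - diff / total if total else 0.0


def combined_similarity(text_a, text_b, alpha=0.8):
    # Assumption: a simple weighted sum; alpha is a hypothetical weight.
    a, b = tokenize(text_a), tokenize(text_b)
    va, vb = tfidf_vectors([a, b])
    return alpha * cosine_similarity(va, vb) + (1 - alpha) * word_order_similarity(a, b)
```

In this sketch, two texts with the same words in a different order get a cosine similarity of 1 but a lower word-order score, which is why combining the two signals can separate reordered copies from verbatim ones.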

Citation (APA)

Dien, T. T., Han, H. N., & Thai-Nghe, N. (2019). An Approach for Plagiarism Detection in Learning Resources. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11814 LNCS, pp. 722–730). Springer. https://doi.org/10.1007/978-3-030-35653-8_52
