Using Word Embedding for Cross-Language Plagiarism Detection

Abstract

This paper proposes using distributed representations of words (word embeddings) for cross-language textual similarity detection. Its main contributions are: (a) new cross-language similarity detection methods based on distributed representations of words; and (b) a combination of the proposed methods that verifies their complementarity and yields an overall F1 score of 89.15% for English-French similarity detection at chunk level (88.5% at sentence level) on a very challenging corpus.
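
The abstract does not spell out the individual methods, so the following is only a minimal sketch of one common way word embeddings can be used to compare sentences across languages: average the word vectors of each sentence in a shared bilingual space and take the cosine of the two averages. It is not presented as the authors' algorithm. The toy vectors in `TOY_EMBEDDINGS` and the function names are hypothetical, invented for illustration.

```python
import math

# Hypothetical toy vectors in a shared English-French embedding space.
# Real systems would load pretrained bilingual embeddings instead.
TOY_EMBEDDINGS = {
    "cat": [0.90, 0.10, 0.00],
    "chat": [0.88, 0.12, 0.02],
    "sleeps": [0.10, 0.80, 0.10],
    "dort": [0.12, 0.79, 0.08],
    "car": [0.00, 0.10, 0.90],
    "voiture": [0.02, 0.09, 0.91],
}


def sentence_vector(tokens, embeddings):
    """Average the vectors of the known tokens; return None if none are known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cross_language_similarity(src_tokens, tgt_tokens, embeddings=TOY_EMBEDDINGS):
    """Score two tokenized sentences (possibly in different languages)."""
    u = sentence_vector(src_tokens, embeddings)
    v = sentence_vector(tgt_tokens, embeddings)
    if u is None or v is None:
        return 0.0  # assumption: no known words means no evidence of similarity
    return cosine(u, v)


if __name__ == "__main__":
    print(cross_language_similarity(["cat", "sleeps"], ["chat", "dort"]))
    print(cross_language_similarity(["cat", "sleeps"], ["voiture"]))
```

With these toy vectors, the English sentence scores far higher against its French translation than against an unrelated French word. A detector built this way would flag a pair when the score exceeds a threshold tuned on labeled data.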

Cite

Ferrero, J., Agnes, F., Besacier, L., & Schwab, D. (2017). Using Word Embedding for Cross-Language Plagiarism Detection. In 15th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2017 - Proceedings of Conference (Vol. 2, pp. 415–421). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/e17-2066
