Information retrieval techniques for corpus filtering applied to external plagiarism detection

Abstract

We present a set of approaches for corpus filtering in the context of external document plagiarism detection. Producing filtered sets, and hence limiting the problem's search space, can improve performance and is used today in many real-world applications such as web search engines. With regard to document plagiarism detection, the database of documents to match the suspicious candidate against is potentially fairly large, so applying filtered set generation techniques is highly advisable. The approaches that we have implemented include information retrieval methods and a document similarity measure based on a variant of tf-idf. Furthermore, we perform textual comparisons, as well as a semantic similarity analysis, in order to capture higher levels of obfuscation. © 2011 Springer-Verlag.
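
To make the idea of filtered set generation concrete, the following is a minimal sketch of tf-idf based corpus filtering. It is not the authors' method: the abstract only says that a variant of tf-idf is used, so this sketch assumes plain term-frequency weighting with a smoothed idf and cosine similarity. The names `tokenize`, `tfidf_vectors`, `cosine`, `filter_corpus`, and the `top_k` parameter are hypothetical and chosen for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split on non-alphanumeric characters (an assumed, simple tokenizer).
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    # docs: list of token lists. Returns sparse tf-idf vectors and the idf table.
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    # Smoothed idf (an assumption, not the variant used in the paper).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for tokens in docs:
        tf = Counter(tokens)
        vectors.append({t: f * idf[t] for t, f in tf.items()})
    return vectors, idf


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_corpus(suspicious, corpus, top_k=2):
    # Return indices of the top_k source documents most similar to the
    # suspicious document, dropping documents with zero similarity.
    vectors, idf = tfidf_vectors([tokenize(d) for d in corpus])
    query_tf = Counter(tokenize(suspicious))
    # Terms unseen in the corpus carry no weight in this sketch.
    query = {t: f * idf[t] for t, f in query_tf.items() if t in idf}
    scored = [(cosine(query, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for score, i in scored[:top_k] if score > 0.0]


corpus = [
    "the cat sat on the mat",
    "stock markets fell sharply today",
    "a cat was sitting on a mat",
]
candidates = filter_corpus("the cat sat on a mat", corpus)
```

Only the documents in `candidates` would then be passed to the more expensive textual and semantic comparisons, which is the search-space reduction the abstract describes.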

Cite

Micol, D., Ferrández, Ó., & Muñoz, R. (2011). Information retrieval techniques for corpus filtering applied to external plagiarism detection. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6716 LNCS, pp. 100–111). https://doi.org/10.1007/978-3-642-22327-3_10
