Plagiarism detection without reference collections


Abstract

Current research in automatic plagiarism detection for text documents focuses on algorithms that compare suspicious documents against potential original documents. Although recent approaches perform well in identifying copied or even modified passages (Brin et al. (1995), Stein (2005)), they assume a closed world in which a reference collection must be given (Finkel (2002)). A human reader, by contrast, can identify suspicious passages within a document without having a library of potential original documents in mind. This raises the question of whether plagiarized passages within a document can be detected automatically if no reference is given, e.g. if the plagiarized passages stem from a book that is not available in digital form. This paper addresses that question: it proposes a method to identify potentially plagiarized passages by analyzing a single document with respect to changes in writing style. Such passages can then be used as a starting point for an Internet search for potential sources, or they can be preselected for inspection by a human referee. Among other things, we present new style features that can be computed efficiently and that provide highly discriminative information. Our experiments, which are based on a test corpus that will be published, show encouraging results.
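
The idea of flagging passages whose writing style deviates from the rest of a single document can be illustrated with a minimal sketch. It does not reproduce the style features or the method proposed in the paper, which the abstract does not specify. The two features (average word length and average sentence length), the z-score test, the threshold of 1.5, and the function names `style_features` and `flag_style_outliers` are all assumptions chosen for illustration.

```python
import re
from statistics import mean, pstdev


def style_features(text):
    """Return (average word length, average sentence length in words).

    These two features are illustrative stand-ins, not the paper's features.
    """
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return (0.0, 0.0)
    avg_word_len = mean(len(w) for w in words)
    avg_sent_len = len(words) / len(sentences)
    return (avg_word_len, avg_sent_len)


def flag_style_outliers(passages, threshold=1.5):
    """Return indices of passages whose style deviates from the document.

    A passage is flagged if, for at least one feature, its value lies more
    than `threshold` population standard deviations from the mean over all
    passages. The threshold is an assumed value, not taken from the paper.
    """
    feats = [style_features(p) for p in passages]
    flagged = set()
    for dim in range(2):
        values = [f[dim] for f in feats]
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:
            continue  # every passage has the same value for this feature
        for i, v in enumerate(values):
            if abs(v - mu) / sigma > threshold:
                flagged.add(i)
    return sorted(flagged)


if __name__ == "__main__":
    plain = "The cat sat on the mat. The dog ran to the park."
    ornate = ("Notwithstanding considerable methodological heterogeneity, "
              "interdisciplinary investigations consistently demonstrate "
              "unprecedented explanatory capabilities.")
    document = [plain, plain, plain, ornate, plain, plain]
    print(flag_style_outliers(document))
```

Flagged passages in such a sketch would only be candidates: as the abstract notes, they serve as a starting point for a source search or for inspection by a human referee, not as proof of plagiarism.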

Citation (APA)

Meyer Zu Eissen, S., Stein, B., & Kulig, M. (2007). Plagiarism detection without reference collections. In Studies in Classification, Data Analysis, and Knowledge Organization (pp. 359–366). Kluwer Academic Publishers. https://doi.org/10.1007/978-3-540-70981-7_40
