Single-pass corpus to corpus comparison by sentence hashing

Abstract

This paper describes a new algorithm for identifying common phrase sequences. The SHAPD2 algorithm was designed for single-pass corpus-to-corpus comparison. It is a highly efficient solution that scales to large amounts of data and outperforms other approaches. One possible application is the detection of potential plagiarism by comparing not a single document against a corpus, but one corpus against another. This makes SHAPD2 a valuable alternative to existing solutions. © Springer International Publishing Switzerland 2014.
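The abstract does not give the details of SHAPD2, so the following is only a minimal sketch of the general idea of comparing two corpora by hashing sentences, not the author's algorithm. The function names (`sentence_hashes`, `build_index`, `compare_corpora`), the regex-based sentence splitting, and the use of SHA-1 are all assumptions made for illustration. Each corpus is read once: the first is turned into a hash index, and the second is streamed against that index.

```python
import hashlib
import re
from collections import defaultdict


def sentence_hashes(text):
    """Yield (hash, normalized sentence) pairs for each sentence in text.

    Assumption: sentences end at '.', '!' or '?', and normalization is
    lowercasing plus keeping only word characters.
    """
    for raw in re.split(r"[.!?]+", text):
        words = re.findall(r"\w+", raw.lower())
        if words:
            normalized = " ".join(words)
            digest = hashlib.sha1(normalized.encode("utf-8")).hexdigest()
            yield digest, normalized


def build_index(corpus):
    """Map each sentence hash to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in corpus.items():
        for digest, _ in sentence_hashes(text):
            index[digest].add(doc_id)
    return index


def compare_corpora(corpus_a, corpus_b):
    """Return {(doc_a, doc_b): [shared sentences]}.

    corpus_a is read once to build the index; corpus_b is read once
    and each sentence is looked up in that index.
    """
    index_a = build_index(corpus_a)
    matches = defaultdict(list)
    for doc_b, text in corpus_b.items():
        for digest, sentence in sentence_hashes(text):
            for doc_a in index_a.get(digest, ()):
                matches[(doc_a, doc_b)].append(sentence)
    return dict(matches)


corpus_a = {
    "a1": "The cat sat on the mat. It was warm.",
    "a2": "Nothing in common here.",
}
corpus_b = {
    "b1": "Dogs bark. The cat sat on the mat!",
    "b2": "It was warm? Yes.",
}
shared = compare_corpora(corpus_a, corpus_b)
```

Because every lookup is a hash-table access, the cost grows with the total number of sentences rather than with the number of document pairs, which is what makes a corpus-to-corpus comparison feasible in this style of approach.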

Cite

Ceglarek, D. (2014). Single-pass corpus to corpus comparison by sentence hashing. Studies in Computational Intelligence, 513, 167–176. https://doi.org/10.1007/978-3-319-01787-7_16
