This paper describes the University of Sheffield entry for the 2nd International Competition on Plagiarism Detection (PAN 2010). Our system targets extrinsic plagiarism, in which suspicious documents are compared against a collection of potential source documents. It uses a three-stage approach: pre-processing, candidate document selection (using word n-grams) and detailed analysis (using the Running Karp-Rabin Greedy String Tiling string matching algorithm). In the official evaluation, this approach achieved an overall score of 0.20, with a precision of 0.40, a recall of 0.16 and a granularity of 1.21.
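The reported figures can be related to one another with a short calculation. The sketch below assumes the overall score is the combined measure used in the PAN evaluations, namely the F1 of precision and recall divided by log2(1 + granularity); the abstract itself does not name the formula, and the function names `f1_score` and `overall_score` are illustrative only.

```python
import math


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def overall_score(precision: float, recall: float, granularity: float) -> float:
    """Assumed overall measure: F1 penalised by detection granularity.

    A granularity of 1 means each plagiarised passage is reported exactly
    once, in which case the score equals plain F1.
    """
    if granularity < 1:
        raise ValueError("granularity is expected to be at least 1")
    return f1_score(precision, recall) / math.log2(1 + granularity)


# Figures reported in the abstract.
score = overall_score(precision=0.40, recall=0.16, granularity=1.21)
print(round(score, 2))
```

Under this assumption the three reported components reproduce the stated overall score of 0.20 after rounding, and the granularity above 1 lowers the score relative to the plain F1 of about 0.23.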
CITATION STYLE
Nawab, R. M. A., Stevenson, M., & Clough, P. (2010). University of Sheffield: Lab report for PAN at CLEF 2010. In CEUR Workshop Proceedings (Vol. 1176). CEUR-WS.