Abstract
Software plagiarism cheats students out of their own education and leads to unfair grading, making software plagiarism detection an important problem. However, many popular plagiarism detection tools are inaccurate, language-specific, or closed source, limiting their applicability. In this work, we seek to address these problems via a novel approach. We adapt the optimal Smith-Waterman sequence alignment algorithm to precisely measure the similarity between programs, greatly improving detection accuracy relative to competitors. Our approach is applicable to any language describable by an ANTLR grammar, which includes most programming languages. We also provide a new type of evaluation based on random program generation and obfuscation. Finally, we make our approach freely available, allowing for customizations and transparent reasoning about detection behavior.
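To make the alignment idea concrete, the following is a minimal sketch of textbook Smith-Waterman local alignment applied to two token sequences. It is not the authors' adapted algorithm. The function name `smith_waterman_score`, the scoring values (`match=2`, `mismatch=-1`, `gap=-1`), and the hand-written token lists are all assumptions made for illustration; in practice the tokens would presumably come from a lexer generated from an ANTLR grammar.

```python
from typing import Sequence


def smith_waterman_score(
    a: Sequence[str],
    b: Sequence[str],
    match: int = 2,      # assumed reward for identical tokens
    mismatch: int = -1,  # assumed penalty for differing tokens
    gap: int = -1,       # assumed linear gap penalty
) -> int:
    """Return the best local-alignment score between two token sequences."""
    prev = [0] * (len(b) + 1)  # previous row of the DP table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at 0 lets an alignment restart anywhere, which is
            # what makes the alignment local rather than global.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Hypothetical token streams for `int x = 1;` and `int y = z;`
tokens_a = ["int", "ID", "=", "NUM", ";"]
tokens_b = ["int", "ID", "=", "ID", ";"]
score = smith_waterman_score(tokens_a, tokens_b)
```

Because identifiers are compared by token type rather than by name in this sketch, renaming a variable does not lower the score; only the changed literal costs one mismatch.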
Cite
Nichols, L., Dewey, K., Emre, M., Chen, S., & Hardekopf, B. (2019). Syntax-based Improvements to Plagiarism Detectors and their Evaluations. In Annual Conference on Innovation and Technology in Computer Science Education, ITiCSE (pp. 555–561). Association for Computing Machinery. https://doi.org/10.1145/3304221.3319789