A Large-Gap Clone Detection Approach Using Sequence Alignment via Dynamic Parameter Optimization

Abstract

Large-gap clones, that is, code clones in which the reused code has undergone many edits, are common in software development practice and widespread in software systems. Detecting such clones is important. However, because of the large number of edits, most existing approaches fail to detect them effectively. This paper aims to find an effective approach for the accurate detection of large-gap clones. We transform the code clone detection problem into a biological sequence alignment problem and propose a novel approach that combines code fingerprints with sequence alignment. The sequence alignment is based on the Smith-Waterman algorithm, but improves on it significantly by using a dynamic parameter acquisition strategy. Furthermore, we design new, rational criteria for clone identification. The proposed approach is evaluated automatically and extensively on more than 10 million lines of code for general clone detection. We further conduct an empirical study on five large-scale Java projects to manually assess the approach for large-gap clone detection. The experimental results show that the proposed approach detects large-gap clones effectively and performs well, while remaining competitive with existing advanced detection tools in general clone detection.
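
To make the idea of aligning code fingerprints concrete, the following is a minimal sketch of textbook Smith-Waterman local alignment applied to sequences of line fingerprints. It is not the authors' implementation: the `fingerprint` helper, the whitespace-stripping normalization, and the fixed `match`, `mismatch`, and `gap` values are all assumptions made for illustration, and the dynamic parameter acquisition strategy and clone identification criteria mentioned in the abstract are not reproduced here.

```python
import hashlib


def fingerprint(line: str) -> str:
    """Hash a whitespace-stripped source line (hypothetical stand-in fingerprint)."""
    normalized = "".join(line.split())
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()[:8]


def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b.

    Uses a linear gap penalty; the fixed scoring values are illustrative
    assumptions, not parameters taken from the paper.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            diag = score[i - 1][j - 1] + step   # align a[i-1] with b[j-1]
            up = score[i - 1][j] + gap          # skip an element of a
            left = score[i][j - 1] + gap        # skip an element of b
            score[i][j] = max(0, diag, up, left)  # 0 restarts the local alignment
            best = max(best, score[i][j])
    return best


# A fragment and an edited copy with two inserted lines (made-up example).
original = [
    "int total = 0;",
    "for (int x : xs) {",
    "total += x;",
    "}",
    "return total;",
]
edited = [
    "int total = 0;",
    'log.info("start");',
    "for (int x : xs) {",
    "if (x < 0) continue;",
    "total += x;",
    "}",
    "return total;",
]

fa = [fingerprint(line) for line in original]
fb = [fingerprint(line) for line in edited]
print(smith_waterman(fa, fb))
```

Because local alignment tolerates gaps at a bounded cost, the two inserted lines lower the score without breaking the alignment. This is the property that makes an alignment-based view attractive for heavily edited clones; how the gap and match costs should be chosen per code pair is exactly what this fixed-parameter sketch leaves open.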

Liu, J., Wang, T., Feng, C., Wang, H., & Li, D. (2019). A Large-Gap Clone Detection Approach Using Sequence Alignment via Dynamic Parameter Optimization. IEEE Access, 7, 131270–131281. https://doi.org/10.1109/ACCESS.2019.2940710
