DSFM: Enhancing Functional Code Clone Detection with Deep Subtree Interactions

Abstract

Functional code clone detection is important for software maintenance. In recent years, deep learning techniques have been introduced to improve the performance of functional code clone detectors. By representing each code snippet as a vector that captures its program semantics, these detectors can identify syntactically dissimilar functional clones. However, existing deep learning-based approaches attach too much importance to code feature learning, hoping to project all recognizable knowledge of a code snippet into a single vector. We argue that these approaches can be enhanced by considering the characteristics of syntactic code clone detection, where the contents of the source code are compared directly (e.g., intersection of tokens, similar flow graphs, and similar subtrees) to identify code clones. In this paper, we propose a novel deep learning-based approach named DSFM, which incorporates comparisons between code snippets for detecting functional code clones. Specifically, we improve typical deep clone detectors with deep subtree interactions that compare every pair of subtrees extracted from the abstract syntax trees (ASTs) of two code snippets, thereby introducing more fine-grained semantic similarity. Through extensive experiments on three widely used datasets, GCJ, OJClone, and BigCloneBench, we demonstrate the great potential of deep subtree interactions in the code clone detection task. The proposed DSFM outperforms the state-of-the-art approaches, including two traditional approaches, two unsupervised deep learning-based baselines, and four supervised deep learning-based baselines.
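
To make the idea of comparing every pair of subtrees concrete, the following is a minimal sketch, not the DSFM model itself. It assumes Python source as input, treats each statement node as a subtree, and swaps in a plain Jaccard similarity over AST node-type names where the paper uses learned deep interactions. All function names (`statement_subtrees`, `node_types`, `pairwise_interactions`, `clone_score`) and the aggregation rule are hypothetical choices made for illustration.

```python
import ast
from typing import FrozenSet, List


def statement_subtrees(source: str) -> List[ast.AST]:
    """Collect every statement node; the granularity is an assumption, not the paper's."""
    tree = ast.parse(source)
    return [node for node in ast.walk(tree) if isinstance(node, ast.stmt)]


def node_types(subtree: ast.AST) -> FrozenSet[str]:
    """Describe a subtree by the set of AST node type names it contains."""
    return frozenset(type(node).__name__ for node in ast.walk(subtree))


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Set overlap in [0, 1]; a stand-in for a learned interaction function."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_interactions(src_a: str, src_b: str) -> List[List[float]]:
    """Compare every subtree of src_a with every subtree of src_b."""
    feats_a = [node_types(s) for s in statement_subtrees(src_a)]
    feats_b = [node_types(s) for s in statement_subtrees(src_b)]
    return [[jaccard(fa, fb) for fb in feats_b] for fa in feats_a]


def clone_score(src_a: str, src_b: str) -> float:
    """Average, over subtrees of src_a, of the best match found in src_b (hypothetical)."""
    matrix = pairwise_interactions(src_a, src_b)
    if not matrix or not matrix[0]:
        return 0.0
    return sum(max(row) for row in matrix) / len(matrix)


# Two functionally equivalent but syntactically different snippets.
SUM_LOOP = "def f(xs):\n    t = 0\n    for x in xs:\n        t += x\n    return t\n"
SUM_BUILTIN = "def g(xs):\n    return sum(xs)\n"

if __name__ == "__main__":
    for row in pairwise_interactions(SUM_LOOP, SUM_BUILTIN):
        print([round(value, 2) for value in row])
    print("score:", round(clone_score(SUM_LOOP, SUM_BUILTIN), 2))
```

The interaction matrix has one row per subtree of the first snippet and one column per subtree of the second, so similarity is measured at the subtree level instead of being compressed into one vector per snippet. A purely syntactic measure such as this one will score functional clones poorly, which is the gap that learned subtree interactions are meant to close.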

Cite

Xu, Z., Qiang, S., Song, D., Zhou, M., Wan, H., Zhao, X., … Zhang, H. (2024). DSFM: Enhancing Functional Code Clone Detection with Deep Subtree Interactions. In Proceedings - International Conference on Software Engineering (pp. 2733–2744). IEEE Computer Society. https://doi.org/10.1145/3597503.3639215
