Parallel and Distributed Code Clone Detection using Sequential Pattern Mining

  • El-Matarawy A
  • El-Ramly M
  • Bahgat R

Abstract

This research presents a parallel and distributed data mining approach to code clone detection. It aims to demonstrate the value of parallel and distributed computing for real-time, large-scale code clone detection. The approach is implemented in a family of clone detectors called PD EgyCD (Parallel and Distributed Egypt Clone Detector). It builds on the authors' earlier work on code clone and plagiarism detection using sequential pattern mining by adding parallelism and distribution to the earlier tool, EgyCD. The approach performs data mining with a tailored Apriori-based algorithm for code clone detection, and it uses parallelization and distribution to achieve high performance and scale clone detection to very large systems. It has been implemented as a database application that leverages the capabilities of modern database tools. Two versions of the distributed technique have been developed. The first uses a client-server technique in which all clients and the server work with a single database. The second uses agents: each client acts as a separate agent with its own database and, after working on a sub-problem, submits its partial solution to the server, which assembles the complete solution (the set of code clones). Experiments show that the agent technique is faster than the client-server one. Distribution improves performance substantially, and the speed improvement is a function of the number of clients/agents used. Our conclusion is that data mining, combined with parallel and distributed computing, can be deployed efficiently for code clone detection in very large systems.
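
The agent-style division of work described in the abstract can be illustrated with a minimal Python sketch. It is not the PD EgyCD implementation: the abstract does not give the tailored algorithm or the database schema, so everything here is an assumption made for illustration. In particular, the function names (`normalize`, `local_occurrences`, `detect_clones`), the whitespace-only normalization, the use of contiguous statement sequences as patterns, in-memory dictionaries in place of per-agent databases, and threads in place of separate client machines are all hypothetical choices.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def normalize(line):
    """Collapse whitespace so trivially reformatted statements compare equal."""
    return " ".join(line.split())


def local_occurrences(files, k, frequent_prev):
    """Agent step (hypothetical): find length-k statement sequences in this
    agent's own files and report where each one occurs.

    Apriori pruning: a length-k candidate is kept only if both of its
    length-(k-1) sub-sequences were frequent in the previous round.
    """
    found = defaultdict(list)
    for name, lines in files.items():
        for start in range(len(lines) - k + 1):
            seq = tuple(lines[start:start + k])
            if k > 1 and (seq[:-1] not in frequent_prev
                          or seq[1:] not in frequent_prev):
                continue
            found[seq].append((name, start))
    return found


def detect_clones(corpus, n_agents=2, min_support=2, min_length=2):
    """Server loop (hypothetical): one round per sequence length; each round
    fans work out to the agents and merges their partial results."""
    # Positions refer to indices in the list of non-blank, normalized lines.
    files = {name: [normalize(l) for l in text.splitlines() if l.strip()]
             for name, text in corpus.items()}
    names = sorted(files)
    partitions = [{n: files[n] for n in names[i::n_agents]}
                  for i in range(n_agents)]

    frequent_prev, clones, k = set(), {}, 1
    with ThreadPoolExecutor(max_workers=n_agents) as pool:
        while True:
            futures = [pool.submit(local_occurrences, part, k, frequent_prev)
                       for part in partitions]
            merged = defaultdict(list)
            for future in futures:
                for seq, locations in future.result().items():
                    merged[seq].extend(locations)
            # Support is counted globally, after merging the partial results.
            frequent = {seq: sorted(locs) for seq, locs in merged.items()
                        if len(locs) >= min_support}
            if not frequent:
                break
            if k >= min_length:
                clones.update(frequent)
            frequent_prev, k = set(frequent), k + 1
    return clones
```

A call such as `detect_clones({"a.py": source_a, "b.py": source_b})` returns a dictionary mapping each repeated statement sequence to the `(file, position)` pairs where it occurs. The sketch reports every frequent sequence, including sub-sequences of longer clones; a real detector would likely keep only the maximal ones.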

Cite

El-Matarawy, A., El-Ramly, M., & Bahgat, R. (2013). Parallel and Distributed Code Clone Detection using Sequential Pattern Mining. International Journal of Computer Applications, 62(10), 25–31. https://doi.org/10.5120/10118-4792
