An important issue faced during software development is to identify defects and the properties of those defects, if found, in a given source file. Determining defectiveness of source code assumes significance due to its implications on software development and maintenance cost. We present a novel system to estimate the presence of defects in source code and detect attributes of the possible defects, such as the severity of defects. The salient elements of our system are: (i) a dataset of newly introduced source code metrics, called PROgramming CONstruct (PROCON) metrics, and (ii) a novel Machine-Learning (ML)-based system, called Defect Estimator for Source Code (DESCo), that makes use of PROCON dataset for predicting defectiveness in a given scenario. The dataset was created by processing 30,400+ source files written in four popular programming languages, viz., C, C++, Java, and Python. The results of our experiments show that DESCo system outperforms one of the state-of-the-art methods with an improvement of 44.9%. To verify the correctness of our system, we compared the performance of 12 different ML algorithms with 50+ different combinations of their key parameters. Our system achieves the best results with SVM technique with a mean accuracy measure of 80.8%.
CITATION STYLE
Kapur, R., & Sodhi, B. (2020). A Defect Estimator for Source Code. ACM Transactions on Software Engineering and Methodology, 29(2). https://doi.org/10.1145/3384517
Mendeley helps you to discover research relevant for your work.