Software defect prediction identifies defective modules in a project in advance, which helps optimize the allocation of test resources. Recently, privacy protection for datasets and models has attracted growing attention from researchers. In this study, we are the first to apply homomorphic encryption to software defect prediction model construction, and we propose a novel method, HOPE. Specifically, we approximate the sigmoid function so that it can be evaluated on encrypted values, and we select the Paillier homomorphic encryption algorithm for logistic regression. In our case study, we use the MORPH dataset, gathered from real-world open-source projects, as our experimental subject. We then design three control groups to simulate three scenarios, distinguished by whether the client sends encrypted data to the server and whether the server uses the HOPE method. The results show that if the server trains the original logistic regression on the encrypted data, the resulting model performs similarly to random guessing, which indicates that the privacy of the data is protected. Moreover, compared with the original logistic regression method, HOPE incurs only a small additional computational cost, with no obvious decrease in performance. We share our implementation scripts and datasets to encourage further research in this direction.
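The combination of Paillier encryption and an approximated sigmoid can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the HOPE implementation: the abstract does not state which approximation is used, so the sketch takes the first-order Taylor expansion sigmoid(z) ≈ 0.5 + 0.25z, and the function names, fixed-point scale, and toy primes are hypothetical. Paillier supports adding ciphertexts and multiplying a ciphertext by a plaintext integer, which is enough to evaluate a linear score on encrypted features.

```python
import math
import random

SCALE = 1000  # assumed fixed-point scale: reals are stored as round(x * SCALE)

def keygen(p=104729, q=1299709):
    # Toy primes for illustration only; real keys use moduli of 2048 bits or more.
    n = p * q
    phi = (p - 1) * (q - 1)
    mu = pow(phi, -1, n)  # modular inverse (Python 3.8+); valid because g = n + 1
    return (n, n + 1), (phi, mu)

def encrypt(pub, m):
    n, g = pub
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    # Negative plaintexts are represented modulo n.
    return (pow(g, m % n, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(pub, priv, c):
    n, _ = pub
    phi, mu = priv
    m = ((pow(c, phi, n * n) - 1) // n * mu) % n
    return m - n if m > n // 2 else m  # map back to a signed integer

def add(pub, c1, c2):
    # Enc(a) * Enc(b) mod n^2 decrypts to a + b.
    n, _ = pub
    return (c1 * c2) % (n * n)

def mul_plain(pub, c, k):
    # Enc(a)^k mod n^2 decrypts to a * k, where k is a plaintext integer.
    n, _ = pub
    return pow(c, k % n, n * n)

def encrypted_sigmoid_approx(pub, enc_x, weights, bias):
    # Server side: computes Enc(4 * sigmoid_approx(z) * SCALE^2) using
    # 4 * sigmoid(z) ~= 2 + z, without ever seeing the plaintext features.
    acc = encrypt(pub, round(bias * SCALE * SCALE))
    for c, w in zip(enc_x, weights):
        acc = add(pub, acc, mul_plain(pub, c, round(w * SCALE)))
    return add(pub, acc, encrypt(pub, 2 * SCALE * SCALE))

# Client encrypts its features; the server holds plaintext weights.
pub, priv = keygen()
x = [0.5, -1.2]          # hypothetical feature values
w, b = [0.8, 0.3], 0.1   # hypothetical model parameters
enc_x = [encrypt(pub, round(v * SCALE)) for v in x]

enc_out = encrypted_sigmoid_approx(pub, enc_x, w, b)
approx = decrypt(pub, priv, enc_out) / (4 * SCALE * SCALE)
exact = 1 / (1 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))
print(approx, exact)
```

A linear approximation is only accurate near z = 0; higher-order polynomial terms would require multiplying two encrypted values, which Paillier alone does not support.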
Citation
Yu, C., Ding, Z., & Chen, X. (2021). HOPE: Software Defect Prediction Model Construction Method via Homomorphic Encryption. IEEE Access, 9, 69405–69417. https://doi.org/10.1109/ACCESS.2021.3078265