Heterogeneous Defect Prediction Based on Federated Transfer Learning via Knowledge Distillation


Abstract

Heterogeneous defect prediction (HDP) aims to predict defect-prone software modules in one project using heterogeneous data collected from other projects. Defect data has two characteristics that complicate this task: data islands and data privacy. In this article, we propose a novel Federated Transfer Learning via Knowledge Distillation (FTLKD) approach for HDP that takes both characteristics into account. First, Shamir secret sharing provides homomorphic encryption of the private data, so the data remains encrypted throughout all subsequent processing. Second, each participant trains a convolutional neural network (CNN) on public data, transfers the parameters of the pre-trained CNN to a private model, and fine-tunes that private model with a small amount of labeled private data. Finally, knowledge distillation provides the communication between participants: the average of all participants' softmax outputs (logits) is used to update the private models. Extensive experiments on 9 projects from 3 public databases (NASA, AEEEM, and SOFTLAB) show that FTLKD outperforms the related competing methods.
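
The abstract does not spell out the sharing protocol, so the following is only a minimal sketch of the general idea behind Shamir secret sharing and its additive homomorphism, not the FTLKD implementation. The field modulus `PRIME`, the function names `make_shares`, `reconstruct`, and `add_shares`, and the threshold and share counts are all assumptions made for illustration. The sketch shows that two parties' values can be summed share by share, and only the sum is ever reconstructed.

```python
import secrets

# Assumed field modulus for this sketch: the Mersenne prime 2^61 - 1.
PRIME = 2**61 - 1


def make_shares(secret, threshold, n_shares):
    """Split `secret` into `n_shares` points on a random polynomial of
    degree threshold - 1 whose constant term is the secret."""
    coeffs = [secret % PRIME]
    coeffs += [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, n_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule, highest degree first
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def reconstruct(shares):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


def add_shares(shares_a, shares_b):
    """Add two sharings point by point; the result is a sharing of the sum."""
    summed = []
    for (xa, ya), (xb, yb) in zip(shares_a, shares_b):
        if xa != xb:
            raise ValueError("shares must be evaluated at the same x")
        summed.append((xa, (ya + yb) % PRIME))
    return summed


# Hypothetical usage: two participants share private integer values.
shares_a = make_shares(120, threshold=3, n_shares=5)
shares_b = make_shares(45, threshold=3, n_shares=5)
total = reconstruct(add_shares(shares_a, shares_b)[:3])
```

Any 3 of the 5 summed shares are enough to recover the total, while fewer than 3 shares of an individual value reveal nothing about it. Real-valued features would first need a fixed-point integer encoding, which this sketch leaves out.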

Cite

Wang, A., Zhang, Y., & Yan, Y. (2021). Heterogeneous Defect Prediction Based on Federated Transfer Learning via Knowledge Distillation. IEEE Access, 9, 29530–29540. https://doi.org/10.1109/ACCESS.2021.3058886
