Distributed machine learning (DML) is one of the core technologies of artificial intelligence (AI). However, existing DML frameworks do not take data integrity into account. If network attackers forge, modify, or destroy the data, the model trained in the DML system will be seriously affected and the training results will be wrong. It is therefore crucial to guarantee data integrity in DML. In this paper, we propose a DML-oriented data integrity verification scheme (DML-DIV) to ensure the integrity of training data. First, we adopt the sampling-based auditing idea of Provable Data Possession (PDP) to verify data integrity, so that DML-DIV resists forgery and tampering attacks. Second, we generate a random number, the blinding factor, and rely on the discrete logarithm problem (DLP) to construct the proof and protect privacy during verification by the third-party auditor (TPA). Third, we employ identity-based cryptography and a two-step key generation technique to generate the data owner's public/private key pair, so that DML-DIV solves the key escrow problem and reduces the cost of certificate management. Finally, formal theoretical analysis and experimental results demonstrate the security and efficiency of DML-DIV.
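The sampling idea behind PDP-style auditing can be illustrated with a minimal Python sketch. This is not the DML-DIV construction: it assumes plain HMAC tags and a verifier that holds the secret key, and it omits the homomorphic tags, the blinding factor, and the identity-based keys that the abstract describes. All names (`split_blocks`, `tag_block`, `make_challenge`, `audit`, `detection_probability`) and the block size are hypothetical and chosen only for illustration.

```python
import hashlib
import hmac
import math
import secrets

BLOCK_SIZE = 4  # hypothetical block size, kept tiny for illustration


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Split the data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def tag_block(key: bytes, index: int, block: bytes) -> bytes:
    """Compute a tag bound to the block position, so blocks cannot be swapped.

    Assumption: a plain HMAC stands in for the scheme's real tags.
    """
    return hmac.new(key, index.to_bytes(8, "big") + block, hashlib.sha256).digest()


def make_challenge(num_blocks: int, sample_size: int) -> list[int]:
    """Pick a random subset of block indices to audit."""
    rng = secrets.SystemRandom()
    return rng.sample(range(num_blocks), sample_size)


def audit(key: bytes, blocks: list[bytes], tags: list[bytes], challenge: list[int]) -> bool:
    """Return True only if every challenged block still matches its tag."""
    return all(
        hmac.compare_digest(tag_block(key, i, blocks[i]), tags[i])
        for i in challenge
    )


def detection_probability(n: int, t: int, c: int) -> float:
    """Probability that sampling c of n blocks hits at least one of t corrupted blocks.

    Computed as 1 - C(n - t, c) / C(n, c).
    """
    return 1.0 - math.comb(n - t, c) / math.comb(n, c)
```

Under these assumptions, the auditor checks only the sampled blocks rather than the whole data set. For example, `detection_probability(10000, 100, 460)` is above 0.99, which is why a small random sample can suffice to catch corruption affecting 1% of the blocks.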
Zhao, X. P., & Jiang, R. (2020). Distributed Machine Learning Oriented Data Integrity Verification Scheme in Cloud Computing Environment. IEEE Access, 8, 26372–26384. https://doi.org/10.1109/ACCESS.2020.2971519