Averaging is probably not the optimum way of aggregating parameters in federated learning


Abstract

Federated learning is a decentralized form of deep learning that trains a shared model over data distributed among clients (such as mobile phones and wearable devices), ensuring data privacy by avoiding exposure of raw data to the data center (server). After each client computes new model parameters by stochastic gradient descent (SGD) on its own local data, these locally computed parameters are aggregated to produce an updated global model. Many current state-of-the-art studies aggregate the different client-computed parameters by averaging them, but none theoretically explains why averaging parameters is a good approach. In this paper, we treat each client-computed parameter as a random vector, owing to the stochastic nature of SGD, and estimate the mutual information between two client-computed parameters at different training phases using two methods in two learning tasks. The results confirm the correlation between different clients and show an increasing trend of mutual information over training iterations. However, when we further compute the distance between client-computed parameters, we find that the parameters become more correlated without getting closer. This phenomenon suggests that averaging parameters may not be the optimum way of aggregating trained parameters.
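The aggregation step the abstract questions — averaging client-computed parameters, as in FedAvg-style schemes — can be sketched as below. This is a minimal illustrative sketch, not the paper's implementation: the function name `fedavg` and the toy client vectors are assumptions, and real systems average per-layer tensors, typically weighted by each client's local dataset size.

```python
import numpy as np

def fedavg(client_params, weights=None):
    """Aggregate client-computed parameter vectors by (weighted) averaging.

    client_params: list of equal-shape parameter vectors, one per client.
    weights: optional per-client weights (e.g. local dataset sizes);
             None means a plain unweighted mean.
    """
    stacked = np.stack(client_params)
    return np.average(stacked, axis=0, weights=weights)

# Two toy "clients" whose local SGD steps perturb a shared global model.
rng = np.random.default_rng(0)
global_model = rng.normal(size=4)
client_a = global_model + 0.1 * rng.normal(size=4)  # client A's local update
client_b = global_model + 0.1 * rng.normal(size=4)  # client B's local update

# Server-side aggregation: the averaged parameters become the new global model.
new_global = fedavg([client_a, client_b])

# The paper's observation contrasts two quantities: statistical dependence
# between client parameters (here a simple correlation proxy) versus their
# Euclidean distance — the former can grow while the latter does not shrink.
corr = np.corrcoef(client_a, client_b)[0, 1]
dist = np.linalg.norm(client_a - client_b)
```

The design point is that averaging implicitly assumes client parameters cluster around a common optimum; the paper's finding that parameters grow more correlated without getting closer is exactly the case where a plain mean may be suboptimal.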

Citation (APA)

Xiao, P., Cheng, S., Stankovic, V., & Vukobratovic, D. (2020). Averaging is probably not the optimum way of aggregating parameters in federated learning. Entropy, 22(3). https://doi.org/10.3390/e22030314
