Experimenting With Normalization Layers in Federated Learning on Non-IID Scenarios

Abstract

Training Deep Learning (DL) models requires large, high-quality datasets, often assembled from data held by different institutions. Federated Learning (FL) has emerged as a method for privacy-preserving pooling of such datasets: the participating institutions collaboratively train a model by iteratively aggregating locally trained models into a global one. A critical performance challenge of FL is operating on datasets that are not independently and identically distributed (non-IID) among the federation participants. Even though this fragility cannot be eliminated, it can be mitigated by a suitable choice of two hyper-parameters: the layer normalization method and the collaboration (model aggregation) frequency. In this work, we benchmark five different normalization layers for training Neural Networks (NNs), across two families of non-IID data skew and two datasets. Results show that Batch Normalization, widely employed in centralized DL, is not the best choice for FL: Group and Layer Normalization consistently outperform it, with a performance gain of up to about 15% in the most challenging non-IID scenario. Similarly, frequent model aggregation decreases convergence speed and model quality.
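The following is a minimal sketch, not the authors' code, of the two hyper-parameters the abstract refers to: the normalization layer used in each client's local model and the collaboration (aggregation) frequency in FedAvg-style training. It assumes PyTorch; the names make_norm, SmallCNN, and fedavg are illustrative, and the set of normalization options shown is not necessarily the exact set benchmarked in the paper.

```python
import copy
import torch
import torch.nn as nn


def make_norm(kind: str, channels: int) -> nn.Module:
    """Return a normalization layer by name (illustrative set of options)."""
    if kind == "batch":
        return nn.BatchNorm2d(channels)
    if kind == "group":
        return nn.GroupNorm(num_groups=2, num_channels=channels)
    if kind == "layer":
        # GroupNorm with a single group normalizes over all channels of each
        # sample, i.e. Layer Normalization for convolutional feature maps.
        return nn.GroupNorm(num_groups=1, num_channels=channels)
    if kind == "instance":
        return nn.InstanceNorm2d(channels, affine=True)
    if kind == "none":
        return nn.Identity()
    raise ValueError(f"unknown normalization: {kind}")


class SmallCNN(nn.Module):
    """Toy CNN whose normalization layer is a hyper-parameter."""

    def __init__(self, norm: str = "group", num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), make_norm(norm, 32), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), make_norm(norm, 64), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))


def fedavg(global_model: nn.Module, client_states, client_sizes):
    """Weighted FedAvg of client state dicts. Calling this every E local
    epochs sets the collaboration frequency: larger E means less frequent
    aggregation."""
    total = sum(client_sizes)
    avg = copy.deepcopy(client_states[0])
    for key in avg:
        avg[key] = sum(
            state[key].float() * (n / total)
            for state, n in zip(client_states, client_sizes)
        )
    global_model.load_state_dict(avg)
    return global_model
```

In this sketch, switching from BatchNorm to GroupNorm or LayerNorm is a one-argument change (norm="batch" vs. norm="group"/"layer"), which is the kind of hyper-parameter comparison the abstract describes.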

Citation (APA)

Casella, B., Esposito, R., Sciarappa, A., Cavazzoni, C., & Aldinucci, M. (2024). Experimenting With Normalization Layers in Federated Learning on Non-IID Scenarios. IEEE Access, 12, 47961–47971. https://doi.org/10.1109/ACCESS.2024.3383783
