DISTREAL: Distributed Resource-Aware Learning in Heterogeneous Systems

15 citations · 9 Mendeley readers

Abstract

We study the problem of distributed training of neural networks (NNs) on devices with heterogeneous, limited, and time-varying availability of computational resources. We present an adaptive, resource-aware, on-device learning mechanism, DISTREAL, which is able to fully and efficiently utilize the available resources on devices in a distributed manner, thereby increasing convergence speed. This is achieved with a dropout mechanism that dynamically adjusts the computational complexity of training an NN by randomly dropping filters of the model's convolutional layers. Our main contribution is the introduction of a design space exploration (DSE) technique, which finds Pareto-optimal per-layer dropout vectors with respect to resource requirements and convergence speed of the training. Applying this technique, each device is able to dynamically select the dropout vector that fits its available resources without requiring any assistance from the server. We implement our solution in a federated learning (FL) system, where the availability of computational resources varies both between devices and over time, and show through extensive evaluation that we are able to significantly increase the convergence speed over the state of the art without compromising on the final accuracy.
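To make the filter-dropout idea concrete, below is a minimal PyTorch sketch of a convolutional layer whose output filters are randomly masked at training time according to a per-layer keep fraction (one entry of a dropout vector). The class name `FilterDropoutConv` and the keep fractions are illustrative assumptions, not the paper's implementation; in particular, the paper drops filters so that their computation is actually skipped, whereas this sketch only zeroes their outputs to illustrate the mechanism.

```python
# Hedged sketch: per-layer filter dropout, assuming a hypothetical dropout
# vector (p_1, ..., p_L) where p_l is the fraction of filters to KEEP in
# convolutional layer l. DISTREAL's implementation skips the computation of
# dropped filters; here we only mask output channels for illustration.
import torch
import torch.nn as nn

class FilterDropoutConv(nn.Module):
    """Conv layer whose output filters can be randomly masked during training."""

    def __init__(self, in_ch, out_ch, keep_frac):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        self.keep_frac = keep_frac  # per-layer entry of the dropout vector

    def forward(self, x):
        y = self.conv(x)
        if self.training and self.keep_frac < 1.0:
            n_filters = y.shape[1]
            n_keep = max(1, int(self.keep_frac * n_filters))
            # Randomly choose which filters survive this training step.
            keep = torch.randperm(n_filters, device=y.device)[:n_keep]
            mask = torch.zeros(n_filters, device=y.device)
            mask[keep] = 1.0
            y = y * mask.view(1, -1, 1, 1)  # zero out the dropped filters
        return y

# A device would select the Pareto-optimal dropout vector matching its
# currently available resources; the values (0.5, 0.75) are illustrative only.
layers = nn.Sequential(
    FilterDropoutConv(3, 16, keep_frac=0.5),
    nn.ReLU(),
    FilterDropoutConv(16, 32, keep_frac=0.75),
)
out = layers(torch.randn(2, 3, 32, 32))
```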

Cite (APA)

Rapp, M., Khalili, R., Pfeiffer, K., & Henkel, J. (2022). DISTREAL: Distributed Resource-Aware Learning in Heterogeneous Systems. In Proceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022 (Vol. 36, pp. 8062–8071). Association for the Advancement of Artificial Intelligence. https://doi.org/10.1609/aaai.v36i7.20778
