LEARNAE: Distributed and resilient deep neural network training for heterogeneous peer to peer topologies


Abstract

Learnae is a framework proposal for decentralized training of Deep Neural Networks (DNNs). The main priority of Learnae is to maintain a fully distributed architecture in which no participant has any coordinating role. This strict peer-to-peer concept covers all aspects: the underlying network protocols, data acquisition/distribution, and model training. The result is a resilient DNN training system with no single point of failure. Learnae focuses on use cases where infrastructure heterogeneity and network unreliability result in an ever-changing environment of commodity-hardware nodes. To achieve this level of decentralization, new technologies had to be utilized. The main pillars of this implementation are the ongoing IPFS and IOTA projects. IPFS is a platform for a purely decentralized filesystem, where each node contributes local data storage. IOTA aims to become the networking infrastructure of the upcoming IoT reality. On top of these, we propose a management algorithm for collaboratively training a DNN model through the optimal exchange of data and model weights, always using distribution-friendly gossip protocols.
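To make the gossip-style weight exchange more concrete, the sketch below shows pairwise gossip averaging of model weights between peers. This is a hypothetical illustration under simplifying assumptions, not the Learnae algorithm itself: the names gossip_round and peers are invented for the example, and the actual framework additionally distributes training data over IPFS and exchanges messages over IOTA.

```python
import random
import numpy as np

def gossip_round(local_weights, neighbor_weight_sets):
    """One gossip exchange: average the local weight tensors with those
    of a randomly chosen neighbor (pairwise averaging).
    Hypothetical sketch only; Learnae's real exchange policy also
    accounts for data availability and node heterogeneity."""
    peer_weights = random.choice(neighbor_weight_sets)
    return [(w + p) / 2.0 for w, p in zip(local_weights, peer_weights)]

# Toy usage: three peers with small two-tensor "models" drift toward consensus.
peers = [[np.random.randn(4, 4), np.random.randn(4)] for _ in range(3)]
for _ in range(10):
    for i in range(len(peers)):
        others = [peers[j] for j in range(len(peers)) if j != i]
        peers[i] = gossip_round(peers[i], others)
```

In a real deployment each round would interleave local SGD steps with such exchanges, so weights converge toward a shared model without any coordinating node.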

Citation (APA)

Nikolaidis, S., & Refanidis, I. (2019). LEARNAE: Distributed and resilient deep neural network training for heterogeneous peer to peer topologies. In Communications in Computer and Information Science (Vol. 1000, pp. 286–298). Springer Verlag. https://doi.org/10.1007/978-3-030-20257-6_24
