Distributed training performance is constrained by two factors: the communication overhead between parameter servers and workers, and the imbalance in computing power across workers. We propose a dynamic-delay-based cyclic gradient update method, which allows workers to push gradients to parameter servers in a round-robin order with dynamic delays. Stale gradient information is accumulated locally in each worker; when a worker obtains the token to update gradients, the accumulated gradients are pushed to the parameter servers. Experiments show that, compared with previous synchronous and cyclic gradient update methods, the dynamic-delay cyclic method converges to the same accuracy faster.
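The following is a minimal single-process sketch of the cyclic update idea described in the abstract: each worker accumulates its gradients locally and pushes them to the parameter server only when it holds the round-robin token. The names (ParameterServer, Worker, compute_gradient) and the toy quadratic loss are illustrative assumptions, not the authors' implementation; the dynamic adjustment of delays is only noted in a comment, and a real system would use RPC or collective communication instead of in-process calls.

```python
import numpy as np

class ParameterServer:
    def __init__(self, dim, lr=0.1):
        self.params = np.zeros(dim)
        self.lr = lr

    def push(self, grad):
        # Apply the accumulated gradient from whichever worker holds the token.
        self.params -= self.lr * grad

    def pull(self):
        return self.params.copy()

class Worker:
    def __init__(self, wid, dim):
        self.wid = wid
        self.acc = np.zeros(dim)  # locally accumulated (possibly stale) gradients

    def step(self, ps, has_token):
        params = ps.pull()
        grad = compute_gradient(params, self.wid)  # stand-in for a mini-batch gradient
        self.acc += grad                           # accumulate stale gradient locally
        if has_token:
            ps.push(self.acc)                      # push accumulated gradients on this worker's turn
            self.acc[:] = 0.0

def compute_gradient(params, seed):
    # Toy quadratic loss ||params - 1||^2 with noise, just to make the sketch runnable.
    rng = np.random.default_rng(seed)
    return 2.0 * (params - 1.0) + 0.01 * rng.standard_normal(params.shape)

if __name__ == "__main__":
    dim, n_workers = 4, 3
    ps = ParameterServer(dim)
    workers = [Worker(w, dim) for w in range(n_workers)]
    token = 0  # round-robin token; a dynamic-delay scheme would adapt how long each worker waits for it
    for it in range(30):
        for w in workers:
            w.step(ps, has_token=(w.wid == token))
        token = (token + 1) % n_workers  # pass the token cyclically
    print(ps.params)
```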
Citation: Hu, W., Wang, P., Wang, Q., Zhou, Z., Xiang, H., Li, M., & Shi, Z. (2018). Dynamic delay based cyclic gradient update method for distributed training. In Lecture Notes in Computer Science (Vol. 11258 LNCS, pp. 550–559). Springer Verlag. https://doi.org/10.1007/978-3-030-03338-5_46