Distributed learning of multilingual DNN feature extractors using GPUs

Abstract

Multilingual deep neural networks (DNNs) can act as deep feature extractors and have been applied successfully to cross-language acoustic modeling. Learning these feature extractors is expensive because of the enlarged multilingual training data and the sequential nature of stochastic gradient descent (SGD). This paper investigates strategies to accelerate the learning process over multiple GPU cards. We propose the DistModel and DistLang frameworks, which distribute feature extractor learning by models and by languages, respectively. The time-synchronous DistModel has the useful property of tolerating infrequent model averaging. With 3 GPUs, DistModel achieves a 2.6× speed-up with no degradation in word error rate. With DistLang, we observe better acceleration but worse recognition performance. Further evaluations scale DistModel to more languages and GPU cards.
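The abstract does not spell out the DistModel update rule, so the following is only a minimal sketch of the general idea it names: several model replicas run SGD on separate data shards and are synchronized by averaging their parameters at a fixed interval. Everything here is an assumption for illustration: a one-parameter linear model stands in for the DNN, the workers are simulated sequentially rather than on GPUs, and the names `train_with_periodic_averaging`, `avg_interval`, and `parallel_efficiency` are hypothetical, not taken from the paper.

```python
import random


def sgd_step(w, x, y, lr):
    """One SGD step for the toy model y ~ w * x under squared error."""
    grad = 2.0 * (w * x - y) * x
    return w - lr * grad


def average_models(weights):
    """Replace every replica's parameter with the mean over all replicas."""
    mean = sum(weights) / len(weights)
    return [mean] * len(weights)


def train_with_periodic_averaging(shards, steps, avg_interval, lr=0.05):
    """Run one replica per shard and average them every `avg_interval` steps.

    Assumption: this is a generic periodic model-averaging loop, not the
    paper's exact DistModel procedure.
    """
    weights = [0.0] * len(shards)  # identical initialization on every worker
    for step in range(1, steps + 1):
        # On real hardware these per-worker updates would run concurrently.
        for k, shard in enumerate(shards):
            x, y = shard[(step - 1) % len(shard)]
            weights[k] = sgd_step(weights[k], x, y, lr)
        if step % avg_interval == 0:
            weights = average_models(weights)  # synchronization point
    return average_models(weights)[0]


def parallel_efficiency(speedup, n_workers):
    """Fraction of ideal linear speed-up actually achieved."""
    return speedup / n_workers


# Synthetic, noise-free data from y = 3x, split into three shards.
rng = random.Random(0)
TRUE_W = 3.0
data = [(x, TRUE_W * x) for x in (rng.uniform(-1.0, 1.0) for _ in range(300))]
shards = [data[i::3] for i in range(3)]

w = train_with_periodic_averaging(shards, steps=400, avg_interval=50)
efficiency = parallel_efficiency(2.6, 3)  # figures reported in the abstract
print(round(w, 3), round(efficiency, 3))
```

In this toy setting the replicas are averaged only once every 50 steps and the estimate still converges close to the true weight, which loosely mirrors the reported tolerance for infrequent averaging. The reported 2.6× speed-up on 3 GPUs corresponds to a parallel efficiency of roughly 0.87.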

Cite

Miao, Y., Zhang, H., & Metze, F. (2014). Distributed learning of multilingual DNN feature extractors using GPUs. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 830–834). International Speech Communication Association. https://doi.org/10.21437/interspeech.2014-211
