Gated Convolutional LSTM for Speech Commands Recognition

Abstract

As mobile devices gain popularity, Acoustic Speech Recognition is becoming one of their leading applications. Unfortunately, the limited battery and computational resources of a mobile device severely restrict the potential of Speech Recognition systems, most of which must resort to a remote server for better performance. To improve the performance of local Speech Recognition, we propose C-1-G-2-Blstm. This model combines a Convolutional Neural Network’s ability to learn local features with a Recurrent Neural Network’s ability to learn long-range dependencies in sequence data. Furthermore, by adopting a Gated Convolutional Neural Network instead of a traditional CNN, we greatly improve the model’s capacity. Our tests demonstrate that C-1-G-2-Blstm achieves an accuracy of 90.6% on the Google Speech Commands data set, which is 6.4% higher than state-of-the-art methods.
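The abstract does not spell out how the gated convolution is computed. As a rough illustration only, the sketch below assumes the common gated-linear-unit form, in which one convolution produces a linear response and a second convolution produces a sigmoid gate that scales it. The function names, the single output channel, and the "valid" padding are illustrative assumptions, not the authors' implementation, and the BLSTM stage is omitted.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def conv1d(frames, kernel, bias):
    """Valid 1-D convolution over time for one output channel.

    frames: list of T feature vectors (each a list of F floats)
    kernel: list of K weight vectors (each a list of F floats)
    Returns a list of T - K + 1 floats.
    """
    width = len(kernel)
    out = []
    for t in range(len(frames) - width + 1):
        acc = bias
        for i in range(width):
            acc += sum(w * x for w, x in zip(kernel[i], frames[t + i]))
        out.append(acc)
    return out


def gated_conv1d(frames, w_lin, b_lin, w_gate, b_gate):
    """Assumed gating: linear response scaled elementwise by a sigmoid gate."""
    linear = conv1d(frames, w_lin, b_lin)
    gate = conv1d(frames, w_gate, b_gate)
    return [a * sigmoid(g) for a, g in zip(linear, gate)]
```

With all gate weights at zero, every gate value is 0.5, so the output is half of the plain convolution; training would move the gates toward 0 or 1 to suppress or pass individual time steps. In a full model, the gated outputs from many channels would form the sequence fed to the bidirectional LSTM.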

Cite

Wang, D., Lv, S., Wang, X., & Lin, X. (2018). Gated Convolutional LSTM for Speech Commands Recognition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10861 LNCS, pp. 669–681). Springer Verlag. https://doi.org/10.1007/978-3-319-93701-4_53
