A depthwise separable convolutional neural network for keyword spotting on an embedded system

Peter Mølgaard Sørensen; Bastian Epp; Tobias May

Journal Article, Open Access

Eurasip Journal on Audio, Speech, and Music Processing (2020) 2020(1)

DOI: 10.1186/s13636-020-00176-2

16 Citations

40 Readers

Abstract

A keyword spotting algorithm implemented on an embedded system using a depthwise separable convolutional neural network classifier is reported. The proposed system was derived from a high-complexity system with the goal of reducing complexity and increasing efficiency. To meet the requirements set by hardware resource constraints, a limited hyper-parameter grid search was performed, which showed that network complexity could be drastically reduced with little effect on classification accuracy. It was furthermore found that quantization of pre-trained networks using mixed and dynamic fixed-point principles could reduce the memory footprint and computational requirements without lowering classification accuracy. Data augmentation, in which training data were mixed with realistic noise recordings, was used to increase network robustness in unseen acoustic conditions. Finally, the system’s ability to detect keywords in a continuous audio stream was successfully demonstrated.
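The abstract does not spell out the quantization scheme, so the following is only a minimal sketch of what dynamic fixed-point quantization of pre-trained weights can look like: each tensor gets its own number of fractional bits, chosen from its largest magnitude. The function names (`choose_frac_bits`, `quantize`, `dequantize`), the 8-bit default, and the round-to-nearest policy are assumptions made for illustration and are not taken from the paper.

```python
import math


def choose_frac_bits(values, total_bits=8):
    """Pick the fractional bit count so the largest magnitude still fits.

    Hypothetical helper: one format per tensor (the "dynamic" part).
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return total_bits - 1
    # Integer bits needed for the magnitude; one further bit holds the sign.
    int_bits = math.floor(math.log2(max_abs)) + 1
    return total_bits - 1 - int_bits


def quantize(values, total_bits=8):
    """Map floats to signed integer codes plus the chosen fractional length."""
    frac_bits = choose_frac_bits(values, total_bits)
    scale = 2.0 ** frac_bits
    q_min = -(1 << (total_bits - 1))
    q_max = (1 << (total_bits - 1)) - 1
    # Round to nearest and saturate at the limits of the signed range.
    codes = [min(q_max, max(q_min, round(v * scale))) for v in values]
    return codes, frac_bits


def dequantize(codes, frac_bits):
    """Recover approximate float values from the integer codes."""
    scale = 2.0 ** frac_bits
    return [c / scale for c in codes]


# Two weight tensors with different ranges end up with different formats.
small_codes, small_frac = quantize([0.5, -0.25, 0.125])
large_codes, large_frac = quantize([3.7, -1.2])
```

In this sketch a tensor with small weights keeps more fractional bits than one with large weights, which is the idea behind choosing the fixed-point format per tensor. A mixed scheme would additionally vary `total_bits` between layers, for example between weights and activations.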

Citation (APA)

Sørensen, P. M., Epp, B., & May, T. (2020). A depthwise separable convolutional neural network for keyword spotting on an embedded system. Eurasip Journal on Audio, Speech, and Music Processing, 2020(1). https://doi.org/10.1186/s13636-020-00176-2
