Reduction of computational cost using two-stage deep neural network for training for denoising and sound source identification

Takayuki Morito; Osamu Sugiyama; Satoshi Uemura; Ryosuke Kojima; Kazuhiro Nakadai

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2016) 9799 562-573

DOI: 10.1007/978-3-319-42007-3_49

Abstract

This paper addresses reducing the computational cost of training a Deep Neural Network (DNN), in particular for sound identification from heavily noise-contaminated audio recorded with a microphone array embedded in an Unmanned Aerial Vehicle (UAV). The goal is to detect people's voices quickly and over a wide area in a disaster situation. A DNN training method called end-to-end training is known to perform well, since it uses a huge, highly nonlinear neural network trained on a large amount of raw input signals without preprocessing. Its computational cost is high, however, because of the complexity of the network. We therefore propose two-stage DNN training using two separately trained networks: one for denoising of sound sources and one for sound source identification. Because the huge network is divided into two smaller networks, the complexity of the networks is expected to decrease, and each network can focus on a specific model, denoising or identification. This results in faster convergence and lower computational cost in DNN training. Preliminary results on noisy acoustic signals recorded with an 8-channel circular microphone array embedded in a UAV showed that the proposed two-stage network needed only 71% of the training time of end-to-end training while maintaining sound source identification accuracy.
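The idea of training the two stages separately can be illustrated with a toy sketch. This is not the authors' implementation: it assumes synthetic data, two channels instead of eight, and swaps the deep networks for a linear denoiser and a logistic-regression identifier so that it runs with the Python standard library alone. All function names and hyperparameters are hypothetical.

```python
import math
import random

random.seed(0)

def make_data(n_samples):
    """Synthetic stand-in for noisy array recordings (assumption, not the paper's data)."""
    data = []
    for _ in range(n_samples):
        label = random.randint(0, 1)
        clean = (1.0 if label == 1 else -1.0) + random.gauss(0.0, 0.1)
        noise = random.gauss(0.0, 2.0)   # strong additive noise
        noisy = (clean + noise, noise)   # ch0: signal + noise, ch1: noise only
        data.append((noisy, clean, label))
    return data

def train_denoiser(data, lr=0.05, epochs=300):
    """Stage 1: fit noisy -> clean with a mean-squared-error loss."""
    w0, w1, b = 0.0, 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        g0 = g1 = gb = 0.0
        for (x0, x1), clean, _ in data:
            err = w0 * x0 + w1 * x1 + b - clean
            g0 += 2 * err * x0 / n
            g1 += 2 * err * x1 / n
            gb += 2 * err / n
        w0, w1, b = w0 - lr * g0, w1 - lr * g1, b - lr * gb
    return w0, w1, b

def denoise(params, noisy):
    w0, w1, b = params
    return w0 * noisy[0] + w1 * noisy[1] + b

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_identifier(data, denoiser, lr=0.5, epochs=200):
    """Stage 2: fit denoised -> label with a cross-entropy loss; stage 1 stays frozen."""
    v, c = 0.0, 0.0
    feats = [(denoise(denoiser, noisy), label) for noisy, _, label in data]
    n = len(feats)
    for _ in range(epochs):
        gv = gc = 0.0
        for y, label in feats:
            p = sigmoid(v * y + c)
            gv += (p - label) * y / n
            gc += (p - label) / n
        v, c = v - lr * gv, c - lr * gc
    return v, c

def accuracy(data, denoiser, identifier):
    """Fraction of samples whose predicted class matches the label."""
    v, c = identifier
    hits = sum(
        (sigmoid(v * denoise(denoiser, noisy) + c) >= 0.5) == (label == 1)
        for noisy, _, label in data
    )
    return hits / len(data)

train, test = make_data(200), make_data(200)
denoiser = train_denoiser(train)                 # trained on (noisy, clean) pairs only
identifier = train_identifier(train, denoiser)   # trained on (denoised, label) pairs only
test_accuracy = accuracy(test, denoiser, identifier)
```

Each stage has its own loss and its own small parameter set, which mirrors the abstract's argument that two smaller, specialized networks are easier to train than one large one. An end-to-end counterpart would instead optimize all parameters jointly against the identification loss alone.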

Cite

Morito, T., Sugiyama, O., Uemura, S., Kojima, R., & Nakadai, K. (2016). Reduction of computational cost using two-stage deep neural network for training for denoising and sound source identification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9799, pp. 562–573). Springer Verlag. https://doi.org/10.1007/978-3-319-42007-3_49
