Deep Learning Approaches for Voice Activity Detection

Abstract

This paper addresses the robustness of voice activity detection (VAD). The proposed approaches use a small set of short-term features that discriminate speech from non-speech to achieve satisfactory performance across different environments. The focus is on improving recently proposed approaches that use the spectral peak valley difference (SPVD) as a silence detection feature; the central problem is how to combine a set of features with SPVD to improve VAD robustness. Three deep learning models, a DNN, an RNN, and a CNN, are used to build VAD systems that are robust to noise. In the experiments, the proposed deep learning approaches are compared with other VAD techniques under various noise types and SNR conditions. With the proposed approaches, average VAD performance improves to 89.72%, 95.01%, and 92.05%, respectively, for 5 diverse noise types. The LSTM result is reported as 10.29% higher than that of the DNN-based method and 7.96% higher than that of the CNN.

Wang, M., Huang, Q., Zhang, J., Li, Z., Pu, H., Lei, J., & Wang, L. (2020). Deep Learning Approaches for Voice Activity Detection. In Advances in Intelligent Systems and Computing (Vol. 928, pp. 816–826). Springer Verlag. https://doi.org/10.1007/978-3-030-15235-2_110
