Deep Learning Approaches for Voice Activity Detection

Mantao Wang; Qiang Huang; Jie Zhang; Zhiyong Li; Haibo Pu; Jinglan Lei; Lanjing Wang

Conference Proceedings

Advances in Intelligent Systems and Computing (2020), vol. 928, pp. 816–826

DOI: 10.1007/978-3-030-15235-2_110

7 Citations

10 Readers

Abstract

This paper addresses the robustness of voice activity detection (VAD). The proposed approaches use a small set of short-term speech/non-speech discriminating features to achieve satisfactory performance in different environments. The paper focuses on improving recently proposed approaches that use the spectral peak valley difference (SPVD) as a silence detection feature. The central problem is to combine SPVD with a set of additional features to improve VAD robustness. The proposed approaches apply deep learning models, namely DNN, RNN, and CNN, to build VAD systems that are robust to noise. In the experiments, the proposed deep learning approaches are compared with several other VAD techniques under various noise types and SNR conditions. With the proposed approaches, average VAD performance improves to 89.72%, 95.01%, and 92.05%, respectively, across 5 diverse noise types. The LSTM result is 10.29% higher than that of the DNN-based method and 7.96% higher than that of the CNN.

Cite

Wang, M., Huang, Q., Zhang, J., Li, Z., Pu, H., Lei, J., & Wang, L. (2020). Deep Learning Approaches for Voice Activity Detection. In Advances in Intelligent Systems and Computing (Vol. 928, pp. 816–826). Springer Verlag. https://doi.org/10.1007/978-3-030-15235-2_110
