In this article, we present a new voice activity detection (VAD) algorithm that combines statistical models with an empirical rule-based energy detection algorithm. Specifically, it separates speech segments from background noise in two steps. In the first step, the VAD efficiently detects candidate speech endpoints using the empirical rule-based energy detection algorithm. However, these candidate endpoints are not accurate enough when the signal-to-noise ratio is low. Therefore, in the second step, we propose a new Gaussian mixture model-based multiple-observation log-likelihood ratio algorithm to align the endpoints to their optimal positions. Several experiments are conducted to evaluate the proposed VAD on both accuracy and efficiency. The results show that it can achieve better performance than the six reference VADs in various noise scenarios.
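The two-step idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the energy rule (a fixed margin above a noise floor estimated from the first frames), the use of scalar log energy as the GMM feature, the hand-picked GMM parameters, and the change-point rule used to align the endpoints are all assumptions made for illustration, and every function name is hypothetical.

```python
import numpy as np

def frame_log_energy(signal, frame_len):
    """Split the signal into non-overlapping frames and return the log energy of each."""
    n_frames = len(signal) // frame_len
    frames = np.reshape(signal[:n_frames * frame_len], (n_frames, frame_len))
    return np.log(np.mean(frames ** 2, axis=1) + 1e-12)

def coarse_endpoints(log_energy, margin=1.0, noise_frames=10):
    """Step 1 (assumed rule): frames exceeding the initial noise floor by `margin` are active."""
    floor = np.mean(log_energy[:noise_frames])
    active = np.flatnonzero(log_energy > floor + margin)
    if active.size == 0:
        return None
    return int(active[0]), int(active[-1])

def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of each scalar in x under a 1-D Gaussian mixture (log-sum-exp form)."""
    x = np.asarray(x, dtype=float)[:, None]
    weights, means, variances = (np.asarray(p, dtype=float) for p in (weights, means, variances))
    log_comp = (np.log(weights) - 0.5 * np.log(2 * np.pi * variances)
                - 0.5 * (x - means) ** 2 / variances)
    peak = log_comp.max(axis=1, keepdims=True)
    return (peak + np.log(np.exp(log_comp - peak).sum(axis=1, keepdims=True)))[:, 0]

def change_score(llr, window=5):
    """For each frame t: summed LLR of the next `window` frames minus that of the previous ones."""
    csum = np.concatenate(([0.0], np.cumsum(llr)))
    n = len(llr)
    score = np.zeros(n)
    for t in range(window, n - window + 1):
        ahead = csum[t + window] - csum[t]
        behind = csum[t] - csum[t - window]
        score[t] = ahead - behind
    return score

def refine_endpoints(score, coarse_start, coarse_end, search=3):
    """Step 2 (assumed rule): snap each endpoint to the strongest LLR change nearby."""
    def best(center, pick):
        lo, hi = max(center - search, 0), min(center + search + 1, len(score))
        return lo + int(pick(score[lo:hi]))
    start = best(coarse_start, np.argmax)        # noise -> speech transition
    end = best(coarse_end + 1, np.argmin) - 1    # speech -> noise transition
    return start, end

# Synthetic example: white noise with a 200 Hz tone in frames 20..39 (8 kHz sampling assumed).
rng = np.random.default_rng(0)
frame_len, n_frames = 160, 60
signal = 0.1 * rng.standard_normal(frame_len * n_frames)
t = np.arange(20 * frame_len)
signal[20 * frame_len:40 * frame_len] += np.sin(2 * np.pi * 200 * t / 8000)

# Hand-picked (hypothetical) mixture parameters over the log-energy feature.
noise_gmm = ([1.0], [-4.6], [0.25])
speech_gmm = ([0.5, 0.5], [-1.0, 0.0], [0.5, 0.5])

log_e = frame_log_energy(signal, frame_len)
coarse = coarse_endpoints(log_e)
llr = gmm_log_likelihood(log_e, *speech_gmm) - gmm_log_likelihood(log_e, *noise_gmm)
refined = refine_endpoints(change_score(llr), *coarse)
```

Summing per-frame log-likelihood ratios over several neighboring frames, rather than deciding frame by frame, is what makes the second step a multiple-observation test; in this sketch the window length and search range are arbitrary choices.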
Wu, J., & Zhang, X.-L. (2011). An efficient voice activity detection algorithm by combining statistical model and energy detection. EURASIP Journal on Advances in Signal Processing, 2011(1). https://doi.org/10.1186/1687-6180-2011-18