Exploiting contextual information for speech/non-speech detection

Sree Hari Krishnan Parthasarathi; Petr Motlíček; Hynek Hermansky

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2008) 5246 LNAI 451-459

DOI: 10.1007/978-3-540-87391-4_58

Abstract

In this paper, we investigate the effect of temporal context on speech/non-speech detection (SND). We show that even a simple feature such as full-band energy, when used with a large enough context, is promising enough to warrant further investigation. Evaluated with the F-measure on the test data set, the proposed approach achieves absolute gains of 4.4% and 5.4% over a state-of-the-art multi-layer perceptron (MLP) based SND system and a simple energy-threshold based SND method, respectively. The optimal context length was found to be 1000 ms. Further numerical optimization yields an additional improvement of 3.37% absolute, for total absolute gains of 7.77% and 8.77% over the MLP-based and energy-based methods, respectively. ROC-based evaluation also indicates promising performance for the proposed method, particularly in low-SNR conditions. © 2008 Springer-Verlag Berlin Heidelberg.
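
The abstract does not describe how the temporal context is applied, so the following is only an illustrative sketch, not the authors' method. It assumes per-frame full-band log energy, a centered moving average as the context operation, a fixed threshold as the decision rule, and a frame-level F-measure for scoring. All function names, the frame length, and the hop size are hypothetical choices made for this example.

```python
import math


def frame_log_energy(samples, frame_len, hop):
    """Full-band log energy, one value per frame (assumed feature definition)."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        # Small floor avoids log(0) on digital silence.
        energies.append(math.log(energy + 1e-10))
    return energies


def smooth_with_context(values, context_frames):
    """Average each value over a centered window of roughly context_frames frames.

    A moving average is an assumption standing in for whatever context
    mechanism the paper actually uses.
    """
    half = context_frames // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def detect_speech(values, threshold):
    """Label a frame as speech when its (smoothed) energy exceeds the threshold."""
    return [v > threshold for v in values]


def f_measure(predicted, reference):
    """Frame-level F-measure: harmonic mean of precision and recall."""
    tp = sum(1 for p, r in zip(predicted, reference) if p and r)
    fp = sum(1 for p, r in zip(predicted, reference) if p and not r)
    fn = sum(1 for p, r in zip(predicted, reference) if r and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

With a hypothetical 10 ms hop, the 1000 ms context reported in the abstract would correspond to `context_frames=100`. The threshold would have to be tuned on held-out data.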

APA

Krishnan Parthasarathi, S. H., Motlíček, P., & Hermansky, H. (2008). Exploiting contextual information for speech/non-speech detection. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5246 LNAI, pp. 451–459). https://doi.org/10.1007/978-3-540-87391-4_58