In this paper we study low-variance multi-taper spectrum estimation methods for computing mel-frequency cepstral coefficient (MFCC) features for robust speech recognition. In speech recognition, MFCC features are usually computed from a Hamming-windowed DFT spectrum. Although windowing helps reduce the bias of the spectrum estimate, the variance of the estimate remains high. Multi-taper spectrum estimation methods can be used to correct this shortcoming of single-taper (single-window) spectrum estimation methods. Experimental results on the AURORA-2 corpus show that the multi-taper methods, specifically the multi-peak multi-taper method, perform better than the Hamming-windowed spectrum estimation method. © 2011 Springer-Verlag.
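The contrast between a single-taper and a multi-taper estimate can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes sine tapers with uniform weights, chosen here only because they have a simple closed form; the multi-peak method named in the abstract uses its own tapers and weights, which are not reproduced. The function names, the default taper count, and the naive DFT are all illustrative choices.

```python
import cmath
import math


def sine_tapers(frame_len, num_tapers):
    """Return num_tapers orthonormal sine tapers of length frame_len.

    Assumption: sine tapers stand in for the paper's tapers.
    """
    scale = math.sqrt(2.0 / (frame_len + 1))
    return [
        [scale * math.sin(math.pi * (k + 1) * (n + 1) / (frame_len + 1))
         for n in range(frame_len)]
        for k in range(num_tapers)
    ]


def power_spectrum(frame, taper):
    """Return |DFT(taper * frame)|^2 for bins 0..N/2 (naive O(N^2) DFT)."""
    n_len = len(frame)
    tapered = [x * w for x, w in zip(frame, taper)]
    spectrum = []
    for k in range(n_len // 2 + 1):
        acc = sum(tapered[n] * cmath.exp(-2j * math.pi * k * n / n_len)
                  for n in range(n_len))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def multitaper_spectrum(frame, num_tapers=6):
    """Average the power spectra obtained with each taper (uniform weights)."""
    tapers = sine_tapers(len(frame), num_tapers)
    spectra = [power_spectrum(frame, t) for t in tapers]
    return [sum(column) / num_tapers for column in zip(*spectra)]


def hamming_spectrum(frame):
    """Single-taper baseline: power spectrum of the Hamming-windowed frame."""
    n_len = len(frame)
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_len - 1))
              for n in range(n_len)]
    return power_spectrum(frame, window)
```

In an MFCC front end, either spectrum would then be passed to the mel filterbank, logarithm, and DCT stages. Averaging over several orthogonal tapers is what lowers the variance of the estimate, at the cost of some frequency resolution.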
Alam, M. J., Kenny, P., & O’Shaughnessy, D. (2011). A study of low-variance multi-taper features for distributed speech recognition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7015 LNAI, pp. 239–245). https://doi.org/10.1007/978-3-642-25020-0_31