A study of low-variance multi-taper features for distributed speech recognition

Md Jahangir Alam; Patrick Kenny; Douglas O'Shaughnessy

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2011) 7015 LNAI 239-245

DOI: 10.1007/978-3-642-25020-0_31

Abstract

In this paper we study low-variance multi-taper spectrum estimation methods for computing mel-frequency cepstral coefficient (MFCC) features for robust speech recognition. In speech recognition, MFCC features are usually computed from a Hamming-windowed DFT spectrum. Although windowing helps reduce the bias of the spectrum estimate, the variance remains high. Multi-taper spectrum estimation methods can be used to correct the shortcomings of single-taper (or single-window) spectrum estimation methods. Experimental results on the AURORA-2 corpus show that the multi-taper methods, specifically the multi-peak multi-taper method, perform better than the Hamming-windowed spectrum estimation method. © 2011 Springer-Verlag.
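
The variance argument in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the multi-peak tapers or their weights, so the sketch swaps in sine tapers with uniform weights as a stand-in multi-taper estimator. The function names, the frame length of 200 samples, the FFT size of 256, and the choice of 6 tapers are all assumptions made for illustration.

```python
import numpy as np

def sine_tapers(frame_len, num_tapers):
    """Orthonormal sine tapers (a stand-in, NOT the paper's multi-peak tapers)."""
    n = np.arange(1, frame_len + 1)
    k = np.arange(1, num_tapers + 1)[:, None]
    return np.sqrt(2.0 / (frame_len + 1)) * np.sin(np.pi * k * n / (frame_len + 1))

def hamming_spectrum(frame, nfft):
    """Single-taper estimate: power spectrum of a Hamming-windowed frame."""
    window = np.hamming(len(frame))
    window = window / np.sqrt(np.sum(window ** 2))  # unit energy, like the sine tapers
    return np.abs(np.fft.rfft(frame * window, nfft)) ** 2

def multitaper_spectrum(frame, nfft, num_tapers=6):
    """Multi-taper estimate: weighted average of the per-taper power spectra."""
    tapers = sine_tapers(len(frame), num_tapers)
    weights = np.full(num_tapers, 1.0 / num_tapers)  # uniform weights (assumption)
    per_taper = np.abs(np.fft.rfft(tapers * frame, nfft, axis=1)) ** 2
    return weights @ per_taper

# White-noise frames have a flat true spectrum, so the spread of the
# estimates across frames reflects the variance of the estimator itself.
rng = np.random.default_rng(0)
frames = rng.standard_normal((500, 200))
ham = np.array([hamming_spectrum(f, 256) for f in frames])
mt = np.array([multitaper_spectrum(f, 256) for f in frames])

ham_var = ham.var(axis=0).mean()
mt_var = mt.var(axis=0).mean()
print(f"mean per-bin variance, Hamming:     {ham_var:.3f}")
print(f"mean per-bin variance, multi-taper: {mt_var:.3f}")
```

In an MFCC front end, either spectrum estimate would then be passed to the mel filterbank, log compression, and DCT stages; only the spectrum estimation step differs between the two approaches.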

Cite

Alam, M. J., Kenny, P., & O’Shaughnessy, D. (2011). A study of low-variance multi-taper features for distributed speech recognition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7015 LNAI, pp. 239–245). https://doi.org/10.1007/978-3-642-25020-0_31
