This work presents a robust normalization technique that cascades a speech enhancement method with a feature vector normalization algorithm. Speech enhancement is provided by the Spectral Subtraction (SS) algorithm, which reduces the effect of additive noise by subtracting an estimate of the noise spectrum from the complete speech spectrum. In addition, an empirical feature vector normalization technique known as PD-MEMLIN (Phoneme-Dependent Multi-Environment Models based Linear Normalization) has also been shown to be effective. PD-MEMLIN models the clean and noisy spaces with Gaussian Mixture Models (GMMs) and estimates a set of linear compensation transformations used to clean the signal. The proper integration of both approaches is studied, and the final design, PD-MEEMLIN (Phoneme-Dependent Multi-Environment Enhanced Models based Linear Normalization), confirms and improves the effectiveness of both approaches. The results show that for very highly degraded speech PD-MEEMLIN outperforms SS by between 11.4% and 34.5%, and PD-MEMLIN by between 11.7% and 24.84%. Furthermore, at moderate SNR, i.e. 15 or 20 dB, PD-MEEMLIN is as good as the PD-MEMLIN and SS techniques. © Springer-Verlag Berlin Heidelberg 2007.
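The spectral subtraction step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the input is already a list of per-frame magnitude spectra, that the first few frames contain only noise, and it uses a generic over-subtraction factor and spectral floor. The names `estimate_noise`, `spectral_subtraction`, `alpha`, and `floor` are hypothetical and chosen for illustration only.

```python
def estimate_noise(frames, n_noise_frames):
    """Average the first n_noise_frames magnitude spectra.

    Assumption: those leading frames contain no speech.
    """
    n_bins = len(frames[0])
    return [
        sum(frame[k] for frame in frames[:n_noise_frames]) / n_noise_frames
        for k in range(n_bins)
    ]


def spectral_subtraction(frames, noise, alpha=1.0, floor=0.01):
    """Subtract alpha * noise from every frame, bin by bin.

    Each result is floored at floor * (noisy magnitude) so that no bin
    becomes negative. alpha and floor are illustrative defaults.
    """
    return [
        [max(mag - alpha * n, floor * mag) for mag, n in zip(frame, noise)]
        for frame in frames
    ]


# Toy magnitude spectra: two noise-only frames, then a frame with speech in bin 1.
noisy = [
    [1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0],
    [1.0, 5.0, 1.0],
]
noise = estimate_noise(noisy, n_noise_frames=2)
enhanced = spectral_subtraction(noisy, noise)
```

In this toy input the speech-bearing bin keeps most of its energy after subtraction, while the noise-only bins fall to the spectral floor. A real front end would compute the magnitude spectra from windowed frames of the waveform and would typically update the noise estimate over time.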
CITATION STYLE
Hernández, I., García, P., Nolazco, J., Buera, L., & Lleida, E. (2007). Robust automatic speech recognition using PD-MEEMLIN. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4478 LNCS, pp. 1–8). Springer Verlag. https://doi.org/10.1007/978-3-540-72849-8_1