The joint optimization of spectro-temporal features and neural net classifiers

György Kovács; László Tóth

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2013) 8082 LNAI 552-559

DOI: 10.1007/978-3-642-40585-3_69

3 Citations

5 Readers

Abstract

In speech recognition, spectro-temporal feature extraction and the training of the acoustical model are usually performed separately. To improve recognition performance, we present a combined model which allows the training of the feature extraction filters along with a neural net classifier. Besides expecting that this joint training will result in a better recognition performance, we also expect that such a neural net can generate coefficients for spectro-temporal filters and also enhance preexisting ones, such as those obtained with the two-dimensional Discrete Cosine Transform (2D DCT) and Gabor filters. We tested these assumptions on the TIMIT phone recognition task. The results show that while the initialization based on the 2D DCT or Gabor coefficients is better in some cases than with simple random initialization, the joint model in practice always outperforms the standard two-step method. Furthermore, the results can be significantly improved by using a convolutional version of the network. © 2013 Springer-Verlag.
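The 2D DCT initialization mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the patch size (9 x 9), the number of retained coefficients (3 x 3), and the function names `dct2_filters`, `init_filter_layer`, and `extract_features` are all assumptions made for illustration. The idea shown is that each spectro-temporal filter is a flattened 2D DCT basis patch, and the stacked filters form the weight matrix of a linear first layer that could afterwards be updated by backpropagation together with the classifier.

```python
import numpy as np


def dct2_filters(n_freq, n_time, k_freq, k_time):
    """Build k_freq * k_time 2D DCT-II basis patches of shape (n_freq, n_time).

    Unnormalized cosine bases; the sizes are illustrative assumptions,
    not values taken from the paper.
    """
    f = np.arange(n_freq)
    t = np.arange(n_time)
    filters = []
    for u in range(k_freq):
        basis_f = np.cos(np.pi * (f + 0.5) * u / n_freq)
        for v in range(k_time):
            basis_t = np.cos(np.pi * (t + 0.5) * v / n_time)
            # A separable 2D basis function is the outer product of two 1D bases.
            filters.append(np.outer(basis_f, basis_t))
    return np.stack(filters)


def init_filter_layer(n_freq=9, n_time=9, k_freq=3, k_time=3):
    """Return a weight matrix whose rows are flattened 2D DCT filters.

    In a joint model this matrix would be the starting point of a trainable
    linear layer rather than a fixed feature extractor.
    """
    filt = dct2_filters(n_freq, n_time, k_freq, k_time)
    return filt.reshape(len(filt), -1)


def extract_features(weights, patch):
    """Apply the filter layer to one spectro-temporal patch (freq x time)."""
    return weights @ patch.ravel()


weights = init_filter_layer()
patch = np.ones((9, 9))  # hypothetical constant spectrogram patch
features = extract_features(weights, patch)
```

For a constant patch only the first (DC) filter responds, because every other cosine basis sums to zero over the patch. Replacing `init_filter_layer` with random values would correspond to the random initialization that the abstract compares against.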

Cite

Kovács, G., & Tóth, L. (2013). The joint optimization of spectro-temporal features and neural net classifiers. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8082 LNAI, pp. 552–559). https://doi.org/10.1007/978-3-642-40585-3_69
