In this letter, the authors propose an auditory feature representation technique in which the filterbank is learned using an annealing dropout convolutional restricted Boltzmann machine (ConvRBM) and noise-robust energy estimation is performed using the Teager energy operator (TEO). The TEO is applied to each subband of the ConvRBM filterbank, and the result is pooled to obtain short-term spectral features. Experiments on the AURORA 4 database show that the proposed features perform better than Mel filterbank features. Relative improvements in word error rate of 2.59%–11.63% and 1.26%–6.87% are achieved using the time delay neural network and the bidirectional long short-term memory models, respectively.
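The subband energy estimation and pooling step described above can be illustrated with a minimal sketch. It uses the standard discrete-time Teager energy operator, Ψ[x(n)] = x(n)² − x(n−1)·x(n+1); everything else is an assumption made for illustration and is not taken from the letter: the function names `teager_energy` and `pooled_teo_features`, average pooling, the 400-sample window and 160-sample shift, the absolute value, and the log compression. The subband signals stand in for the outputs of a learned ConvRBM filterbank, which is not reproduced here.

```python
import numpy as np


def teager_energy(x):
    """Discrete TEO along the last axis: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    x = np.asarray(x, dtype=float)
    psi = np.zeros_like(x)
    psi[..., 1:-1] = x[..., 1:-1] ** 2 - x[..., :-2] * x[..., 2:]
    # The first and last samples have only one neighbour, so copy the
    # nearest interior value (an illustrative boundary choice).
    psi[..., 0] = psi[..., 1]
    psi[..., -1] = psi[..., -2]
    return psi


def pooled_teo_features(subbands, win=400, hop=160, eps=1e-8):
    """Average-pool the TEO of each subband over short-term windows.

    subbands: array of shape (n_bands, n_samples), assumed here to be the
              subband outputs of some filterbank.
    Returns an array of shape (n_frames, n_bands) of log-compressed energies.
    win, hop, and the log compression are assumptions for this sketch.
    """
    energy = np.abs(teager_energy(subbands))  # TEO can be negative; keep magnitude
    n_samples = energy.shape[-1]
    n_frames = 1 + (n_samples - win) // hop
    frames = np.stack(
        [energy[:, i * hop : i * hop + win].mean(axis=1) for i in range(n_frames)]
    )
    return np.log(frames + eps)
```

For a pure tone A·cos(Ωn + φ), the operator returns the constant A²·sin²(Ω), which depends on both amplitude and frequency. This property makes a sinusoid a convenient sanity check for the sketch.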
CITATION
Sailor, H. B., & Patil, H. A. (2017). Auditory feature representation using convolutional restricted Boltzmann machine and Teager energy operator for speech recognition. The Journal of the Acoustical Society of America, 141(6), EL500–EL506. https://doi.org/10.1121/1.4983751