This chapter describes a use of recurrent neural networks (i.e., networks in which feedback is incorporated in the computation) as an acoustic model for continuous speech recognition. The form of the recurrent neural network is described, along with an appropriate parameter estimation procedure. For each frame of acoustic data, the recurrent network generates an estimate of the posterior probability of each possible phone given the observed acoustic signal. The posteriors are then converted into scaled likelihoods and used as the observation probabilities within a conventional decoding paradigm (e.g., Viterbi decoding). The advantages of using recurrent networks are that they require a small number of parameters and provide a fast decoding capability (relative to conventional large-vocabulary HMM systems).
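The conversion from posteriors to scaled likelihoods can be illustrated with a minimal sketch. It assumes the usual hybrid-system recipe of dividing each phone posterior by that phone's prior probability (by Bayes' rule this gives the likelihood up to a factor that is constant across phones for a given frame), working in the log domain. The function name `scaled_log_likelihoods`, the `floor` parameter, and the toy numbers are hypothetical and are not taken from the chapter.

```python
import math

def scaled_log_likelihoods(posteriors, priors, floor=1e-10):
    """Convert per-frame phone posteriors into scaled log-likelihoods.

    posteriors: list of frames, each a list of P(phone | acoustics)
    priors:     list of P(phone), e.g. relative frequencies in training data
    floor:      hypothetical guard against taking log(0)
    """
    return [
        [math.log(max(p, floor)) - math.log(max(q, floor))
         for p, q in zip(frame, priors)]
        for frame in posteriors
    ]

# Toy example (made-up values): 2 frames, 3 phone classes.
posteriors = [[0.7, 0.2, 0.1],
              [0.1, 0.6, 0.3]]
priors = [0.5, 0.3, 0.2]

scaled = scaled_log_likelihoods(posteriors, priors)
```

The resulting values would take the place of the log observation probabilities in a standard Viterbi decoder. A phone whose posterior exceeds its prior gets a positive score, and one whose posterior falls below its prior gets a negative score.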
Robinson, T., Hochberg, M., & Renals, S. (1996). The Use of Recurrent Neural Networks in Continuous Speech Recognition (pp. 233–258). https://doi.org/10.1007/978-1-4613-1367-0_10