This paper proposes a novel neural network architecture for phoneme-based speech recognition, called the Time-Warping Neural Network (TWNN). The architecture consists of five time-warping sub-networks and an output layer that integrates them. Each sub-network has a different time-warping function embedded between its input layer and first hidden layer, and recognizes the input speech by warping the time axis with that function. The purpose of the network is to cope with the temporal variability of acoustic-phonetic features. The TWNN achieves higher phoneme recognition accuracy than a baseline recognizer composed of time-delay neural networks with a linear time alignment mechanism. © 1992, Acoustical Society of Japan. All rights reserved.
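The structure described in the abstract can be illustrated with a minimal sketch in plain Python. The abstract gives only the overall layout (five sub-networks, a warp between the input layer and the first hidden layer, and an integrating output layer), so everything else here is an assumption made for illustration: the power-law warping functions and their exponents in `GAMMAS`, the linear interpolation in `warp_frames`, the fully connected layers, the layer sizes, the random untrained weights, and the names `SubNetwork` and `TWNNSketch`. It is not the paper's implementation.

```python
import math
import random

# Hypothetical warping exponents. The abstract only states that there are five
# sub-networks, each with a different time-warping function.
GAMMAS = (0.5, 0.75, 1.0, 1.5, 2.0)


def warp_frames(frames, gamma):
    """Resample a frame sequence along a warped time axis (assumed power-law warp).

    Output frame i reads the input at normalized time (i / (n - 1)) ** gamma,
    with linear interpolation between the two neighbouring input frames.
    """
    n = len(frames)
    if n < 2:
        return [list(f) for f in frames]
    warped = []
    for i in range(n):
        pos = (i / (n - 1)) ** gamma * (n - 1)
        lo = min(int(math.floor(pos)), n - 2)
        frac = pos - lo
        warped.append([(1 - frac) * a + frac * b
                       for a, b in zip(frames[lo], frames[lo + 1])])
    return warped


def make_layer(n_in, n_out, rng):
    """Random weight matrix with n_out rows and n_in columns (untrained)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def dense(weights, x, activation=None):
    """Bias-free fully connected layer with an optional activation."""
    out = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return [activation(o) for o in out] if activation else out


class SubNetwork:
    """One time-warping sub-network: warp, hidden layer, phoneme scores."""

    def __init__(self, gamma, n_frames, n_feats, n_hidden, n_phonemes, rng):
        self.gamma = gamma
        self.w_hidden = make_layer(n_frames * n_feats, n_hidden, rng)
        self.w_out = make_layer(n_hidden, n_phonemes, rng)

    def forward(self, frames):
        # The warp sits between the input layer and the first hidden layer.
        warped = warp_frames(frames, self.gamma)
        x = [v for frame in warped for v in frame]
        return dense(self.w_out, dense(self.w_hidden, x, math.tanh))


class TWNNSketch:
    """Five sub-networks plus an output layer that integrates their scores."""

    def __init__(self, n_frames, n_feats, n_hidden, n_phonemes, seed=0):
        rng = random.Random(seed)
        self.subnets = [SubNetwork(g, n_frames, n_feats, n_hidden, n_phonemes, rng)
                        for g in GAMMAS]
        self.w_integrate = make_layer(len(GAMMAS) * n_phonemes, n_phonemes, rng)

    def forward(self, frames):
        scores = [s for net in self.subnets for s in net.forward(frames)]
        logits = dense(self.w_integrate, scores)
        peak = max(logits)
        exps = [math.exp(v - peak) for v in logits]  # numerically stable softmax
        total = sum(exps)
        return [e / total for e in exps]
```

With an input of `n_frames` feature vectors, `TWNNSketch(...).forward(frames)` returns one probability per phoneme class. Because each sub-network sees the same input stretched or compressed differently, the integrating layer can favor whichever warp best matches the speaking rate of the input, which is the intuition behind coping with temporal variability.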
CITATION STYLE
Aikawa, K. (1992). Phoneme recognition using time-warping neural networks. Journal of the Acoustical Society of Japan (E), 13(6), 395–402. https://doi.org/10.1250/ast.13.395