Modelling contextual variations of phones is widely accepted as an important aspect of a continuous speech recognition system, and HMM distribution clustering has been successfully used to obtain robust models of context through distribution tying. However, as systems move to the challenge of spontaneous speech, temporal variation also becomes important. This paper describes a method for designing HMM topologies that learn both temporal and contextual variation, extending previous work on successive state splitting (SSS). The new approach uses a maximum likelihood criterion consistently at each step, overcoming the previous SSS limitation to speaker-dependent training. Initial experiments with the reformulated algorithm show both performance gains and reduced training cost relative to SSS. © 1997 Academic Press Limited.
CITATION STYLE
Ostendorf, M., & Singer, H. (1997). HMM topology design using maximum likelihood successive state splitting. Computer Speech and Language, 11(1), 17–41. https://doi.org/10.1006/csla.1996.0021