A fully consistent hidden semi-Markov model-based speech recognition system

Keiichiro Oura; Heiga Zen; Yoshihiko Nankaku; Akinobu Lee; Keiichi Tokuda

Journal Article, Open Access

IEICE Transactions on Information and Systems (2008) E91-D(11) 2693-2700

DOI: 10.1093/ietisy/e91-d.11.2693

Abstract

In a hidden Markov model (HMM), state duration probabilities decrease exponentially with time, which fails to adequately represent the temporal structure of speech. One of the solutions to this problem is integrating state duration probability distributions explicitly into the HMM. This form is known as a hidden semi-Markov model (HSMM). However, though a number of attempts to use HSMMs in speech recognition systems have been proposed, they are not consistent because various approximations were used in both training and decoding. By avoiding these approximations using a generalized forward-backward algorithm, a context-dependent duration modeling technique and weighted finite-state transducers (WFSTs), we construct a fully consistent HSMM-based speech recognition system. In a speaker-dependent continuous speech recognition experiment, our system achieved about 9.1% relative error reduction over the corresponding HMM-based system. Copyright © 2008 The Institute of Electronics, Information and Communication Engineers.
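The abstract's first claim can be illustrated numerically. In a standard HMM, staying in a state for d frames requires d - 1 self-transitions followed by one exit, so the implied duration distribution is geometric: P(d) = a^(d-1) (1 - a), where a is the self-transition probability. The sketch below compares that with an explicitly parameterized duration distribution of the kind an HSMM attaches to each state. The function names, the value a = 0.8, and the discretized Gaussian shape are assumptions chosen for illustration only; the abstract does not say which duration distribution family the authors use.

```python
import math


def hmm_duration_pmf(self_loop_prob, max_duration):
    """Duration distribution implied by an HMM self-transition.

    P(d) = a^(d-1) * (1 - a) for d = 1..max_duration (geometric).
    """
    a = self_loop_prob
    return [(a ** (d - 1)) * (1.0 - a) for d in range(1, max_duration + 1)]


def explicit_duration_pmf(mean, std, max_duration):
    """Illustrative explicit duration distribution (an assumption, not the
    paper's choice): a Gaussian evaluated at d = 1..max_duration and
    renormalized so the probabilities sum to one.
    """
    weights = [
        math.exp(-0.5 * ((d - mean) / std) ** 2)
        for d in range(1, max_duration + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # a = 0.8 gives a geometric mean duration of 1 / (1 - a) = 5 frames,
    # so both distributions are centred on roughly the same duration.
    hmm = hmm_duration_pmf(0.8, 20)
    explicit = explicit_duration_pmf(5.0, 1.5, 20)

    print(" d   HMM (geometric)   explicit (Gaussian-shaped)")
    for d, (p_hmm, p_exp) in enumerate(zip(hmm, explicit), start=1):
        print(f"{d:2d}   {p_hmm:.4f}            {p_exp:.4f}")
```

With these settings the geometric distribution always assigns its highest probability to a duration of one frame and decays from there, whereas the explicit distribution peaks at its mean of five frames. That mismatch in shape is the limitation that explicit duration modeling is meant to address.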

Cite

Oura, K., Zen, H., Nankaku, Y., Lee, A., & Tokuda, K. (2008). A fully consistent hidden semi-Markov model-based speech recognition system. IEICE Transactions on Information and Systems, E91-D(11), 2693–2700. https://doi.org/10.1093/ietisy/e91-d.11.2693
