A fully consistent hidden semi-Markov model-based speech recognition system


Abstract

In a hidden Markov model (HMM), state duration probabilities decrease exponentially with time, which fails to adequately represent the temporal structure of speech. One solution to this problem is to integrate state duration probability distributions explicitly into the HMM; the resulting model is known as a hidden semi-Markov model (HSMM). Although a number of attempts to use HSMMs in speech recognition systems have been proposed, they are not consistent because various approximations were used in both training and decoding. By avoiding these approximations using a generalized forward-backward algorithm, a context-dependent duration modeling technique, and weighted finite-state transducers (WFSTs), we construct a fully consistent HSMM-based speech recognition system. In a speaker-dependent continuous speech recognition experiment, our system achieved a relative error reduction of about 9.1% over the corresponding HMM-based system. Copyright © 2008 The Institute of Electronics, Information and Communication Engineers.
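
The exponential decay mentioned in the abstract follows from the self-transition structure of an HMM: a state with self-loop probability a is occupied for exactly d frames with probability a^(d-1)(1-a), which is largest at d = 1. The sketch below contrasts that implicit geometric duration with an explicit duration distribution of the kind an HSMM allows. The shifted Poisson used here, the parameter values, and all function names are illustrative assumptions, not the duration model or settings used in the paper.

```python
import math


def geometric_duration_pmf(self_loop_prob, d):
    """Implicit HMM duration: probability of staying exactly d frames (d >= 1)."""
    return self_loop_prob ** (d - 1) * (1.0 - self_loop_prob)


def shifted_poisson_duration_pmf(mean_extra, d):
    """Hypothetical explicit duration model: (d - 1) follows Poisson(mean_extra)."""
    k = d - 1
    return math.exp(-mean_extra) * mean_extra ** k / math.factorial(k)


# Both distributions are given a mean duration of 5 frames (assumed values).
durations = list(range(1, 11))
hmm_pmf = [geometric_duration_pmf(0.8, d) for d in durations]
hsmm_pmf = [shifted_poisson_duration_pmf(4.0, d) for d in durations]

for d, p_hmm, p_hsmm in zip(durations, hmm_pmf, hsmm_pmf):
    print(f"d={d:2d}  HMM (geometric)={p_hmm:.4f}  explicit={p_hsmm:.4f}")
```

With the same mean duration, the geometric distribution always puts its highest probability on a single frame, whereas the explicit distribution can peak at a longer duration.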

Citation (APA)

Oura, K., Zen, H., Nankaku, Y., Lee, A., & Tokuda, K. (2008). A fully consistent hidden semi-Markov model-based speech recognition system. IEICE Transactions on Information and Systems, E91-D(11), 2693–2700. https://doi.org/10.1093/ietisy/e91-d.11.2693
