A tutorial on hidden Markov models and selected applications in speech recognition
- ISSN: 00189219
- ISBN: 1558601244
- DOI: 10.1109/5.18626
- PubMed: 16892386
Abstract
This tutorial provides an overview of the basic theory of hidden Markov models (HMMs) as originated by L.E. Baum and T. Petrie (1966) and gives practical details on methods of implementation of the theory along with a description of selected applications of the theory to distinct problems in speech recognition. Results from a number of original sources are combined to provide a single source for acquiring the background required to pursue this area of research further. The author first reviews the theory of discrete Markov chains and shows how the concept of hidden states, where the observation is a probabilistic function of the state, can be used effectively. The theory is illustrated with two simple examples, namely coin-tossing and the classic balls-in-urns system. Three fundamental problems of HMMs are noted and several practical techniques for solving these problems are given. The various types of HMMs that have been studied, including ergodic as well as left-right models, are described.
LAWRENCE R. RABINER, FELLOW, IEEE
Although initially introduced and studied in the late 1960s and
early 1970s, statistical methods of Markov source or hidden Markov
modeling have become increasingly popular in the last several
years. There are two strong reasons why this has occurred. First the
models are very rich in mathematical structure and hence can form
the theoretical basis for use in a wide range of applications. Sec-
ond the models, when applied properly, work very well in practice
for several important applications. In this paper we attempt to care-
fully and methodically review the theoretical aspects of this type
of statistical modeling and show how they have been applied to
selected problems in machine recognition of speech.
I. INTRODUCTION
Real-world processes generally produce observable out-
puts which can be characterized as signals. The signals can
be discrete in nature (e.g., characters from a finite alphabet,
quantized vectors from a codebook, etc.), or continuous in
nature (e.g., speech samples, temperature measurements,
music, etc.). The signal source can be stationary (i.e., its sta-
tistical properties do not vary with time), or nonstationary
(i.e., the signal properties vary over time). The signals can
be pure (i.e., coming strictly from a single source), or can
be corrupted from other signal sources (e.g., noise) or by
transmission distortions, reverberation, etc.
A problem of fundamental interest is characterizing such
real-world signals in terms of signal models. There are sev-
eral reasons why one is interested in applying signal models.
First of all, a signal model can provide the basis for a the-
oretical description of a signal processing system which can
be used to process the signal so as to provide a desired out-
put. For example if we are interested in enhancing a speech
signal corrupted by noise and transmission distortion, we
can use the signal model to design a system which will opti-
mally remove the noise and undo the transmission distor-
tion. A second reason why signal models are important is
that they are potentially capable of letting us learn a great
deal about the signal source (i.e., the real-world process
which produced the signal) without having to have the
source available. This property is especially important when
the cost of getting signals from the actual source is high.
Manuscript received January 15, 1988; revised October 4, 1988.
The author is with AT&T Bell Laboratories, Murray Hill, NJ 07974-2070, USA.
IEEE Log Number 8825949.
In this case, with a good signal model, we can simulate the
source and learn as much as possible via simulations.
Finally, the most important reason why signal models are
important is that they often work extremely well in practice,
and enable us to realize important practical systems (e.g.,
prediction systems, recognition systems, identification
systems, etc.) in a very efficient manner.
There are several possible choices for what type of signal
model is used for characterizing the properties of a given
signal. Broadly one can dichotomize the types of signal
models into the class of deterministic models, and the class
of statistical models. Deterministic models generally exploit
some known specific properties of the signal, e.g., that the
signal is a sine wave, or a sum of exponentials, etc. In these
cases, specification of the signal model is generally straightforward;
all that is required is to determine (estimate) values
of the parameters of the signal model (e.g., amplitude, fre-
quency, phase of a sine wave, amplitudes and rates of expo-
nentials, etc.). The second broad class of signal models is
the set of statistical models in which one tries to charac-
terize only the statistical properties of the signal. Examples
of such statistical models include Gaussian processes, Pois-
son processes, Markov processes, and hidden Markov pro-
cesses, among others. The underlying assumption of the
statistical model is that the signal can be well characterized
as a parametric random process, and that the parameters
of the stochastic process can be determined (estimated) in
a precise, well-defined manner.
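The distinction between the two classes can be made concrete with a minimal sketch. It assumes a noisy sine wave whose frequency is known, estimates the amplitude and phase by least squares (the deterministic model), and then describes the leftover noise only through its mean and variance (a simple Gaussian statistical model). The function names fit_sinusoid and fit_gaussian, the test frequency, and the noise level are illustrative choices, not part of the original paper.

```python
import math
import random


def fit_sinusoid(samples, omega):
    """Least-squares fit of A*cos(omega*n + phi) to samples; omega is assumed known."""
    # Rewrite the model as a*cos(omega*n) + b*sin(omega*n), which is linear in (a, b).
    scc = ssc = sss = sxc = sxs = 0.0
    for n, x in enumerate(samples):
        c, s = math.cos(omega * n), math.sin(omega * n)
        scc += c * c
        ssc += s * c
        sss += s * s
        sxc += x * c
        sxs += x * s
    # Solve the 2x2 normal equations by Cramer's rule.
    det = scc * sss - ssc * ssc
    a = (sxc * sss - sxs * ssc) / det
    b = (sxs * scc - sxc * ssc) / det
    # Convert back: a = A*cos(phi), b = -A*sin(phi).
    return math.hypot(a, b), math.atan2(-b, a)


def fit_gaussian(samples):
    """Maximum-likelihood mean and variance of a Gaussian statistical model."""
    mean = sum(samples) / len(samples)
    var = sum((x - mean) ** 2 for x in samples) / len(samples)
    return mean, var


# Hypothetical test signal: amplitude 2.0, phase 0.5 rad, additive Gaussian noise.
rng = random.Random(0)
omega = 2 * math.pi * 0.05
noisy = [2.0 * math.cos(omega * n + 0.5) + rng.gauss(0.0, 0.1) for n in range(400)]

# Deterministic model: estimate the parameters of the sine wave.
amp, phase = fit_sinusoid(noisy, omega)

# Statistical model: characterize the residual only by its statistics.
residual = [x - amp * math.cos(omega * n + phase) for n, x in enumerate(noisy)]
noise_mean, noise_var = fit_gaussian(residual)

print("amplitude:", amp, "phase:", phase)
print("noise mean:", noise_mean, "noise variance:", noise_var)
```

In both cases the task reduces to estimating a small set of parameters; the difference is whether those parameters describe the waveform itself or only its statistical properties.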
For the applications of interest, namely speech process-
ing, both deterministic and stochastic signal models have
had good success. In this paper we will concern ourselves
strictly with one type of stochastic signal model, namely the
hidden Markov model (HMM). (These models are referred
to as Markov sources or probabilistic functions of Markov
chains in the communications literature.) We will first
review the theory of Markov chains and then extend the
ideas to the class of hidden Markov models using several
simple examples. We will then focus our attention on the
three fundamental problems¹ for HMM design, namely: the
¹The idea of characterizing the theoretical aspects of hidden
Markov modeling in terms of solving three fundamental problems
is due to Jack Ferguson of IDA (Institute for Defense Analysis) who
introduced it in lectures and writing.
0018-9219/89/02000257$01.00 1989 IEEE
PROCEEDINGS OF THE IEEE, VOL. 77, NO. 2, FEBRUARY 1989