Hierarchical voting experts: An unsupervised algorithm for hierarchical sequence segmentation

by Matthew Miller, Alexander Stoytchev
International Conference on Development and Learning ICDL08 (2008)

Available from ieeexplore.ieee.org

Hierarchical Voting Experts: An Unsupervised
Algorithm for Hierarchical Sequence Segmentation
Matthew Miller and Alexander Stoytchev
Developmental Robotics Lab
Iowa State University
{mamille | alexs}@iastate.edu
Abstract—This paper extends the Voting Experts (VE) algo-
rithm for unsupervised segmentation of sequences to create the
Hierarchical Voting Experts (HVE) algorithm for unsupervised
segmentation of hierarchically structured sequences. The paper
evaluates the strengths and weaknesses of the HVE algorithm to
identify its proper domain of application. The paper also shows
how higher order models of the sequence data can be used to
improve lower level segmentation accuracy.
I. INTRODUCTION
The world is too complex to be considered all at once, both
computationally and conceptually. Instead, it must be broken
into manageable pieces, or chunks, and dealt with one piece at
a time [1]. However, this is not a trivial task. It isn’t clear what
segmentation strategy one should use, or even what metric
should be used to evaluate the quality of a segmentation.
Human beings have an astounding and apparently innate
ability to induce such a segmentation [2], and this mechanism
has been variously described and measured [1], [3], [4], [5],
[6]. Modeling this process would certainly be an academically
and practically fruitful endeavor. The Voting Experts (VE) al-
gorithm suggests just such a model, and has demonstrated the
capability to accurately segment natural language text [7]. It
proposes that chunks have a certain signature, i.e., they exhibit
two information theoretic characteristics, namely low internal
entropy and high boundary entropy. In other words, chunks
are composed of elements that are frequently found together,
and that are found together in many different circumstances.
VE looks for these two properties and uses them to segment
text. It is surprisingly powerful given its simplicity, suggesting
that the principle of segmenting based on low internal entropy
and high boundary entropy is promising.
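
The two signatures can be illustrated with a minimal Python sketch. This is not the VE implementation: the function names, the toy text, and the choice to estimate probabilities from raw substring counts are assumptions made here for illustration. Internal entropy is taken to be the surprisal of the chunk among substrings of the same length, and boundary entropy the Shannon entropy of the symbol that follows the chunk.

```python
import math
from collections import Counter


def ngram_counts(text, n):
    """Count every length-n substring of text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def internal_entropy(text, chunk):
    """Surprisal -log2 P(chunk) among substrings of the same length.

    Assumes chunk occurs at least once in text. Lower values mean the
    elements of the chunk are found together more often.
    """
    counts = ngram_counts(text, len(chunk))
    total = sum(counts.values())
    return -math.log2(counts[chunk] / total)


def boundary_entropy(text, chunk):
    """Shannon entropy (in bits) of the symbol following each occurrence of chunk."""
    n = len(chunk)
    followers = Counter()
    for i in range(len(text) - n):
        if text[i:i + n] == chunk:
            followers[text[i + n]] += 1
    total = sum(followers.values())
    return -sum((c / total) * math.log2(c / total) for c in followers.values())


if __name__ == "__main__":
    # Hypothetical toy corpus with word boundaries removed.
    toy = "thecatthedogthecowthepig"
    for candidate in ("the", "th", "cat"):
        print(candidate,
              internal_entropy(toy, candidate),
              boundary_entropy(toy, candidate))
```

On this toy text, "the" is followed by three different letters and so has a boundary entropy of 1.5 bits, while "th" is always followed by "e" and has a boundary entropy of zero. A good chunk boundary falls after "the", not after "th".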
Real world data often exhibits an inherently hierarchical
structure, and it is well known that humans chunk the world
hierarchically [1], [3]. When we read text our eyes scan the
letters and sense black and white shapes. These shapes are
chunked into letters, which are chunked together into words,
which are chunked into phrases and so on. This hierarchical
grouping is fundamental to our interaction with the world.
This paper extends the VE algorithm to segment hier-
archically structured sequences. We show that VE can be
generalized to work on hierarchical data and investigate the
applicability of this extension to determine its strengths and
limitations. More specifically, we strive to understand when
the underlying information theoretic model for segmentation
is valid, and when it is not. We then show that the higher
order models can be used to improve the accuracy of the
segmentation at lower levels.
II. RELATED WORK
Several algorithms have been described in the literature
for unsupervised sequence segmentation. In particular there
exist segmentation algorithms that use statistical properties of
sequences [8], [9], [10], [11], [12]. There also exist models of
infant speech segmentation based on clustering or Bayesian
approaches [13], [14]. Additionally, the SEQUITUR algorithm
has demonstrated the ability to discover hierarchical structure
in sequence data, and has been altered to perform unsuper-
vised segmentation tasks [15], [7]. However, its segmentation
performance is inferior to that of VE [7].
The work presented here, however, is more closely related
to the field of Statistical Learning. A paper by Saffran, John-
son, Aslin and Newport demonstrated that humans possess
a general mechanism for segmenting audio data [5]. They
claim that the segmentation was induced based on “statistical
cues.” These are the “sequential properties” of the phonemes
or tones [5]. Specifically, given two sequential tones A and B,
the probability that B follows A is generally higher if the two
tones are part of the same word, and generally lower if there is
a word break between them. The study concludes that humans
must use these statistical cues to segment audio streams [5].
But these cues are simply more impoverished versions of the
“low internal entropy” and “high boundary entropy” signatures
of chunks used by VE.
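
The statistical cue described above can be sketched in a few lines of Python. The stream below is a hypothetical toy built from three made-up "words" (abc, def, ghi); it is not data from [5], and the function name is an assumption of this sketch.

```python
from collections import Counter


def transition_probabilities(stream):
    """Estimate P(next = b | current = a) from adjacent pairs in stream."""
    pair_counts = Counter(zip(stream, stream[1:]))
    # Only symbols that have a successor are counted as "current".
    first_counts = Counter(stream[:-1])
    return {(a, b): c / first_counts[a] for (a, b), c in pair_counts.items()}


if __name__ == "__main__":
    # Hypothetical stream: the words abc, def, ghi concatenated in varying order.
    stream = "abcdefghiabcghidefghiabcdef"
    probs = transition_probabilities(stream)
    print("within a word, a -> b:", probs[("a", "b")])
    print("across a break, c -> d:", probs[("c", "d")])
    print("across a break, c -> g:", probs[("c", "g")])
```

Here the within-word transition a -> b has probability 1, while the transitions out of c are split between d and g because a word break follows c.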
When we say that a sequence has low internal entropy,
we mean that the transition probabilities between consecutive
elements of the sequence are high. When we say that a
sequence has high boundary entropy, we mean that,
given the sequence, no particular element has a
high probability of coming next. Specifying these markers in
terms of information theory [16] gives us a very clear and
well understood characterization. The VE model can be seen
as a refinement and artificial implementation of this model of
human segmentation. This model may or may not capture the
true human strategy. However, it seems complementary to the
findings of Saffran, Aslin and others.
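
The text gives no formulas at this point. One common way to write the two markers down, stated here as an assumption and not as the paper's own notation, is the following, where w = x_1 ... x_n is a candidate chunk and c ranges over the alphabet:

```latex
% Assumed notation: w = x_1 x_2 \dots x_n is a candidate chunk; c ranges over the alphabet.
\begin{align}
H_{\mathrm{int}}(w) &= -\log P(x_1 \dots x_n)
  = -\log P(x_1) - \sum_{i=2}^{n} \log P(x_i \mid x_1 \dots x_{i-1}), \\
H_{\mathrm{bnd}}(w) &= -\sum_{c} P(c \mid w)\, \log P(c \mid w).
\end{align}
```

By the chain rule, every high transition probability inside the chunk contributes a term close to zero, so high transition probabilities yield a low internal entropy. The boundary entropy is largest when P(c | w) is uniform over the possible successors and zero when a single successor is certain.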
