Learning in Wubble World
2007 IEEE 6th International Conference on Development and Learning (2007)
- ISBN: 9781424411153
- DOI: 10.1109/DEVLRN.2007.4354034
Abstract
Why do children master language so quickly and thoroughly, whereas gigabytes of text and enormously sophisticated learning algorithms produce at best shallow semantics in machines? Because children have help from competent speakers who relate language to what's happening in the child's environment. To facilitate the task of machine word learning, we developed a simulated environment, called "Wubble World," and populated it with entities called wubbles. Children interact with the wubbles using natural language, and act as teachers when the wubble needs help. This paper presents our word learning algorithms and provides some empirical results.
Wesley Kerr, Shane Hoversten, Daniel Hewlett
Paul Cohen, Yu-Han Chang
USC Information Sciences Institute
4676 Admiralty Way
Marina del Rey, CA 90292
{wkerr,shane,dhewlett,cohen,ychang}@isi.edu
Index Terms: language, development, online learning, semantics,
virtual environment
I. LEARNING LANGUAGE IN WUBBLE WORLD
The purpose of this project is to have machines learn
language in the same way as young children do. Today, machines
typically learn language models by extracting statistics from
gigabytes of text. Children, by contrast, learn language from
competent, facilitative speakers who strive to associate their
language with what’s going on in the child’s environment. When
a child says, “More!” the parent says, “More milk?” and points
to the empty milk glass.
Why do interactions like this produce such rapid mastery
of language, whereas gigabytes of text and sophisticated
learning algorithms produce at best shallow semantics? The
reason is that, to a human language learner, the semantics
of sentences are immediately accessible in scenes—in what
is going on when the language is uttered. In contrast, a
machine, given only text, with no access to scenes, can extract
only poor, shallow semantics from distributional statistics,
syntactic classes, and other structural features of the text.
To facilitate the task of word learning, we developed a
simulated environment, called “Wubble World,” and populated it
with entities called wubbles (for a system overview see [9]).
Children interact with the wubbles using natural language,
and are told to treat them as younger siblings who might
need help in understanding what is said to them.
In Wubble World the child is given a task that her wubble
must accomplish in order for the child to advance in the
game. The wubble is told what to do in English; it can parse
sentences, but it doesn’t know what words mean. The wubble
hears a word, and looks at the scene. If it’s uncertain about
the word’s meaning it asks the teacher a question.
This protocol mimics the way children learn words: the
language is grounded (there is a scene to which it refers);
it is functional (understanding it lets the wubble interpret
its environment); there is a competent speaker to whom the
wubble can pose questions; and there is no negative feedback.
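The interaction loop described above (hear a word, look at the scene, ask the teacher when uncertain, learn only from positive examples) can be illustrated with a minimal sketch. This is not the word learning algorithm of this paper: the feature-counting learner, the uncertainty test, the class and function names, and the threshold values are all assumptions chosen only to make the protocol concrete.

```python
from collections import defaultdict


class WordLearner:
    """Hypothetical wubble: counts which scene features co-occur with a word."""

    def __init__(self, threshold=0.8, min_obs=2):
        self.threshold = threshold  # assumed fraction of uses a feature must appear in
        self.min_obs = min_obs      # assumed number of examples before guessing alone
        self.seen = defaultdict(int)                         # word -> examples seen
        self.counts = defaultdict(lambda: defaultdict(int))  # word -> feature -> count

    def meaning(self, word):
        """Features present in at least `threshold` of the word's taught uses."""
        n = self.seen[word]
        if n == 0:
            return set()
        return {f for f, c in self.counts[word].items() if c / n >= self.threshold}

    def is_uncertain(self, word, scene):
        """Uncertain if the word is new or its meaning does not pick out one object."""
        if self.seen[word] < self.min_obs:
            return True
        m = self.meaning(word)
        matches = [obj for obj in scene if m and m <= obj]
        return len(matches) != 1

    def observe(self, word, referent):
        """Positive example only: the teacher indicated the referent of the word."""
        self.seen[word] += 1
        for feature in referent:
            self.counts[word][feature] += 1

    def hear(self, word, scene, teacher):
        """Return (chosen object, whether the teacher was asked)."""
        if self.is_uncertain(word, scene):
            referent = teacher(word, scene)  # ask the teacher a question
            self.observe(word, referent)
            return referent, True
        m = self.meaning(word)
        return next(obj for obj in scene if m <= obj), False


def make_teacher(lexicon):
    """Simulated teacher: points at the first object carrying the word's true feature."""
    def teacher(word, scene):
        return next(obj for obj in scene if lexicon[word] in obj)
    return teacher


def run_lesson(learner, word, scenes, teacher):
    """Play the word in each scene; return the list of 'asked the teacher' flags."""
    return [learner.hear(word, scene, teacher)[1] for scene in scenes]


# Toy scenes: each object is a set of (hypothetical) perceptual features.
scenes = [
    [frozenset({"red", "cube"}), frozenset({"blue", "cylinder"})],
    [frozenset({"blue", "cube"}), frozenset({"red", "cylinder"})],
    [frozenset({"green", "cube"}), frozenset({"green", "cylinder"})],
]
wubble = WordLearner()
asked = run_lesson(wubble, "cube", scenes, make_teacher({"cube": "cube"}))
print(asked, wubble.meaning("cube"))
```

In this toy run the wubble asks the teacher in the first two scenes and resolves "cube" on its own in the third, because only the feature shared by both taught referents survives the threshold. Note that the teacher never says "no": the learner updates only from referents it is shown, mirroring the absence of negative feedback in the protocol.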
Although we describe the protocol in these terms, right
now Wubble World is in beta release and not accessible by
the general public. Since we are unable to present results
acquired using children as teachers, we instead present results
acquired using a simulated teacher. The rest of this paper will
focus on both the theory and the experimental results of the
wubble’s learning nouns, adjectives, and prepositions.
II. PREVIOUS WORK
Related work begins with SHRDLU, a blocks world created
by Terry Winograd [26]. This system generated and understood
natural language situated in a simulated world. It could acquire
new knowledge and learn about its environment, but it differs
from Wubble World in that it is a purely symbolic system.
Gorniak and Roy [8] developed a system in which two
subjects share a scene. The first subject selects an object from
the scene and describes it to the second subject, who must
then try to identify the referent. After collecting the language
data generated by the subjects, the authors created a computer
model that uses these data to perform the same task as the
human subjects. This approach of gathering data and refining
semantics is similar in some ways to ours, although we use an
online model of word learning that allows instruction from a
teacher.
Roy [18] also developed a batch-learning system that learns
to generate natural language from raw image data coupled with
human-produced natural language descriptions.
There is a significant body of research on the symbol
grounding problem in the connectionist network community
[21], [14], [7], [17]. Most focus on learning words
associated with images, and are trained and tested in batch
processes. Sankar and Gorin [20] developed a system similar
to a 2D Wubble World, which successfully learned 431 words