
Empirical Tests of the Gradual Learning Algorithm

by Paul Boersma, Bruce Hayes
Linguistic Inquiry (2001)

Abstract

The Gradual Learning Algorithm (Boersma 1997) is a constraint-ranking algorithm for learning optimality-theoretic grammars. The purpose of this article is to assess the capabilities of the Gradual Learning Algorithm, particularly in comparison with the Constraint Demotion algorithm of Tesar and Smolensky (1993, 1996, 1998, 2000), which initiated the learnability research program for Optimality Theory. We argue that the Gradual Learning Algorithm has a number of special advantages: it can learn free variation, deal effectively with noisy learning data, and account for gradient well-formedness judgments. The case studies we examine involve Ilokano reduplication and metathesis, Finnish genitive plurals, and the distribution of English light and dark /l/.


Available from www.mitpressjournals.org

Empirical Tests of the Gradual Learning Algorithm*
Paul Boersma Bruce Hayes
University of Amsterdam UCLA
September 29, 1999
Abstract
The Gradual Learning Algorithm (Boersma 1997) is a constraint ranking algorithm for
learning Optimality-theoretic grammars. The purpose of this article is to assess the
capabilities of the Gradual Learning Algorithm, particularly in comparison with the
Constraint Demotion algorithm of Tesar and Smolensky (1993, 1996, 1998), which
initiated the learnability research program for Optimality Theory. We argue that the
Gradual Learning Algorithm has a number of special advantages: it can learn free
variation, avoid failure when confronted with noisy learning data, and account for
gradient well-formedness judgments. The case studies we examine involve Ilokano
reduplication and metathesis, Finnish genitive plurals, and the distribution of English
light and dark /l/.
1 Introduction
Optimality Theory (Prince and Smolensky 1993) has made possible a new and fruitful approach
to the problem of phonological learning. If the language learner has access to an appropriate
inventory of constraints, then a complete grammar can be derived, provided there is an algorithm
available that can rank the constraints on the basis of the input data. This possibility has led to a
line of research on ranking algorithms, originating with the work of Tesar and Smolensky (1993,
1996, 1998; Tesar 1995), who propose an algorithm called Constraint Demotion, reviewed below.
Other work on ranking algorithms includes Pulleyblank and Turkel (1995, 1996, 1998, to
appear), Broihier (1995), and Hayes (1999).
Our focus here is the Gradual Learning Algorithm, as developed by Boersma (1997, 1998, to
appear). This algorithm is in some respects a development of Tesar and Smolensky’s proposal: it
directly perturbs constraint rankings in response to language data, and, like most previously
proposed algorithms, it is error-driven, in that it alters rankings only when the input data conflict
with its current ranking hypothesis. What is different about the Gradual Learning Algorithm is
the type of Optimality-Theoretic grammar it presupposes: rather than a set of discrete rankings,
it assumes a continuous scale of constraint strictness. Also, the grammar is regarded as
stochastic: at every evaluation of the candidate set, a small noise component is temporarily
added to the ranking value of each constraint, so that the grammar can produce variable outputs
if some constraint rankings are close to each other.
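The stochastic evaluation just described can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Gaussian shape of the noise, the standard deviation of 2.0, the constraint names A and B, and the toy candidates "x" and "y" are all assumptions made for illustration.

```python
import random
from collections import Counter


def evaluate(ranking_values, violations, noise_sd=2.0, rng=random):
    """Run one stochastic evaluation and return the winning candidate."""
    # Temporarily add noise to every ranking value (Gaussian noise is an assumption).
    noisy = {c: v + rng.gauss(0.0, noise_sd) for c, v in ranking_values.items()}
    # Order the constraints from highest to lowest noisy value, for this evaluation only.
    order = sorted(noisy, key=noisy.get, reverse=True)
    # Ordinary strict domination: compare violation profiles constraint by constraint.
    return min(violations, key=lambda cand: [violations[cand].get(c, 0) for c in order])


def output_counts(ranking_values, violations, trials=1000, seed=0):
    """Count how often each candidate wins over repeated noisy evaluations."""
    rng = random.Random(seed)
    return Counter(evaluate(ranking_values, violations, rng=rng) for _ in range(trials))


# Hypothetical toy tableau: candidate "x" violates A, candidate "y" violates B.
VIOLATIONS = {"x": {"A": 1}, "y": {"B": 1}}
close = output_counts({"A": 100.0, "B": 99.0}, VIOLATIONS)  # nearly tied constraints
far = output_counts({"A": 100.0, "B": 80.0}, VIOLATIONS)    # widely separated constraints
```

Under these assumptions, the nearly tied grammar yields both "x" and "y" across trials, whereas the widely separated grammar yields "y" in effect every time.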

* We would like to thank Arto Anttila for helpful input in the preparation of this paper. Thanks also to Louis
Pols, the University of Utrecht, and the UCLA Academic Senate for material assistance in making our joint work
possible. The work of the first author was supported by a grant from the Netherlands Organization for Scientific
Research.
The continuous ranking scale implies a different response to input data: rather than a
wholesale reranking, the Gradual Learning Algorithm executes only small perturbations to the
constraints’ locations along the scale. We argue that this more conservative approach yields
important advantages in three areas. First, the Gradual Learning Algorithm can fluently handle
optionality; it readily forms grammars that can generate multiple outputs. Second, the algorithm
is robust, in the sense that speech errors occurring in the input data do not lead it off course.
Third, the algorithm is capable of developing formal analyses of linguistic phenomena in which
speakers’ judgments involve intermediate well-formedness.
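To make the idea of small, error-driven perturbations concrete, here is one way such an adjustment could look in Python. The passage above does not spell out the update rule, so the rule used here is an assumption: constraints that penalize the observed form are lowered slightly, and constraints that penalize the learner's mismatched output are raised slightly. The function name `gla_update` and the step size `plasticity` are likewise hypothetical.

```python
def gla_update(ranking_values, learner_viol, datum_viol, plasticity=0.1):
    """Return ranking values nudged after the learner's output mismatches a datum.

    learner_viol / datum_viol map constraint names to violation counts for the
    learner's own output and for the observed form. The rule is an assumption.
    """
    updated = dict(ranking_values)
    for constraint in updated:
        learner = learner_viol.get(constraint, 0)
        datum = datum_viol.get(constraint, 0)
        if datum > learner:
            # The constraint disfavors the observed form: move it down a little.
            updated[constraint] -= plasticity
        elif learner > datum:
            # The constraint disfavors the learner's wrong output: move it up a little.
            updated[constraint] += plasticity
    return updated


# Hypothetical example: the learner produced a form violating A, but heard one violating B.
before = {"A": 100.0, "B": 100.0, "C": 50.0}
after = gla_update(before, learner_viol={"A": 1}, datum_viol={"B": 1})
```

Because each step is small, a single aberrant datum shifts the ranking values only slightly. When the learner's output already matches the datum, the two violation profiles are identical and nothing changes.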
A paradoxical aspect of the Gradual Learning Algorithm is that, even though it is statistical
and gradient in character, most of the constraint rankings it learns are (for all practical purposes)
categorical. These categorical rankings emerge as the limit of gradual learning. Categorical
rankings are of course crucial for learning data patterns where there is no optionality.
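A short calculation suggests why widely separated ranking values behave categorically. The sketch below assumes independent Gaussian noise with the same standard deviation on each of two constraints. Neither that assumption nor the value 2.0 comes from the text above, and `reversal_probability` is a hypothetical helper name.

```python
import math


def reversal_probability(gap, noise_sd=2.0):
    """Probability that the lower-valued of two constraints outranks the higher one.

    gap is the difference between the two ranking values. With independent
    Gaussian noise of standard deviation noise_sd on each constraint, the noisy
    difference has standard deviation noise_sd * sqrt(2), which gives
    P(reversal) = Phi(-gap / (noise_sd * sqrt(2))) = 0.5 * erfc(gap / (2 * noise_sd)).
    """
    return 0.5 * math.erfc(gap / (2.0 * noise_sd))


# Reversal probabilities for a few hypothetical gaps between ranking values.
for gap in (0.0, 2.0, 10.0, 20.0):
    print(gap, reversal_probability(gap))
```

Under these assumptions, two constraints with equal values swap half the time. At a gap of 10 units the reversal probability is already below one in a thousand, and at 20 units it is negligible for all practical purposes.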
Learning algorithms can be assessed on both theoretical and empirical grounds. At the
purely theoretical level, we want to know if an algorithm can be guaranteed to learn all
grammars that possess the formal properties it presupposes. Research results on this question as
it concerns the Gradual Learning Algorithm are reported in Boersma (1997, 1998, to appear).
On the empirical side, we need to show that natural languages are indeed appropriately analyzed
with grammars of the formal type the algorithm can learn.
This paper focuses on the second of these two tasks. We confront the Gradual Learning
Algorithm with a variety of representative phonological phenomena, in order to assess its
capabilities in various ways. This approach reflects our belief that learning algorithms can be
tested just like other proposals in linguistic theory, by checking them against language data.
A number of our data examples are taken from the work of the second author, who arrived
independently at the notion of a continuous ranking scale, and has with colleagues developed a
number of hand-crafted grammars that work on this basis (Hayes and MacEachern 1998; Hayes,
to appear).
We will begin by reviewing how the Gradual Learning Algorithm works, then present several
empirical applications. A study of Ilokano phonology shows how the algorithm can cope with
data involving systematic optionality. We also use a restricted subset of the Ilokano data to
simulate the response of the algorithm to speech errors. In both cases, we make comparisons
with the behavior of the Constraint Demotion Algorithm. We next turn to the study of output
frequencies, posed as an additional, stringent empirical test of the Gradual Learning Algorithm.
We use the algorithm to replicate the study of Anttila (1997a,b) on Finnish genitive plurals.
Lastly, we turn to gradient well-formedness, showing that the algorithm can replicate the results
on English /l/ derived with a hand-crafted grammar by Hayes (to appear).
2 How the Gradual Learning Algorithm Works
Two concepts crucial to the Gradual Learning Algorithm are the continuous ranking scale and
stochastic candidate evaluation. We cover these first, then turn to the internal workings of the
algorithm.
2.1 The Continuous Ranking Scale
The algorithm presupposes a linear scale of constraint strictness, in which higher values
correspond to higher-ranked constraints. The scale is arranged in arbitrary units, and in principle
