Experimental Syntax: Exploring the effect of repeated exposure to anomalous syntactic structure -- evidence from rating and reading tasks

by Jerid Francom
(2009)

Available from Jerid Francom's profile on Mendeley.

EXPERIMENTAL SYNTAX: EXPLORING THE EFFECT OF
REPEATED EXPOSURE TO ANOMALOUS SYNTACTIC
STRUCTURE – EVIDENCE FROM RATING AND READING
TASKS
by
Jerid Cole Francom
A Dissertation Submitted to the Faculty of the
DEPARTMENT OF LINGUISTICS
In Partial Fulfillment of the Requirements
For the Degree of
DOCTOR OF PHILOSOPHY
In the Graduate College
THE UNIVERSITY OF ARIZONA
2 0 0 9
THE UNIVERSITY OF ARIZONA
GRADUATE COLLEGE
As members of the Final Examination Committee, we certify that we have read the
dissertation prepared by Jerid Cole Francom
entitled Experimental Syntax: exploring the effect of repeated exposure to anomalous
syntactic structure – evidence from rating and reading tasks
and recommend that it be accepted as fulfilling the dissertation requirement for the
Degree of Doctor of Philosophy.
Date: 14 May 2009
Simin Karimi
Date: 14 May 2009
Andrew Barss
Date: 14 May 2009
Kenneth Forster
Date: 14 May 2009
Michael Hammond
Final approval and acceptance of this dissertation is contingent upon the
candidate’s submission of the final copies of the dissertation to the Graduate
College.
I hereby certify that I have read this dissertation prepared under my direction and
recommend that it be accepted as fulfilling the dissertation requirement.
Date: 14 May 2009
Dissertation Director: Janet Nicol
STATEMENT BY AUTHOR
This dissertation has been submitted in partial fulfillment of requirements for an
advanced degree at The University of Arizona and is deposited in the University
Library to be made available to borrowers under rules of the Library.
Brief quotations from this dissertation are allowable without special permission,
provided that accurate acknowledgment of source is made. Requests for permission
for extended quotation from or reproduction of this manuscript in whole or in part
may be granted by the head of the major department or the Dean of the Graduate
College when in his or her judgment the proposed use of the material is in the
interests of scholarship. In all other instances, however, permission must be obtained
from the author.
SIGNED: Jerid Cole Francom
ACKNOWLEDGEMENTS
Although this dissertation was written by me, it was a collaborative effort and would
not have come to fruition without the help of many over a number of years. I will
not be able to recognize nor fully thank everyone who has supported me in this
effort in particular, and in my studies more generally, but I hope to name a few
people who made special and direct contributions to the work documented in this
thesis and my growth as an academic and as a person.
First and foremost I would like to express my gratitude to my chair Janet Nicol.
Our weekly meetings, email exchanges and walks on the mall were invaluable to the
development of the ideas in this work. I will always be grateful for her intellectual
insights and keen ability to instill calm in an otherwise turbulent process.
I want to thank the other members of my advising committee; Simin Karimi,
Ken Forster, Mike Hammond and Andy Barss. I feel very fortunate to have been
able to assemble such a capable and caring group of people to collaborate on this
project. I will never forget our roundtable discussions, which were extremely engaging
and productive. It was an honor to work with such a talented group; thank you.
I have to thank two individuals that share the blame for leading me on my current
academic path. I would not be working in this area if it were not for the intervention
of Eliud Chuffe during a summer course I took for a language requirement. His
encouragement that I study Spanish at the graduate level led to my contact with
language study and modern linguistic theory. However, without Antxon Olarrea the
seed planted might not have taken root. Between A-bar and Zappa, Antxon showed
me that there was a place for my rebel spirit and intellectual curiosity in academia.
His keen insight into scientific inquiry, dedication to students and love for life helped
me realize this thesis and will continue to inspire me throughout my days.
I am fortunate to have shared my graduate career with great and talented
students along the way. I would like to give special thanks to members of my weekend
support group: Mercedes Tubino-Blanco and Mans Hulden. Our discussions over
spirits, peppered with comments on language, linguistics, politics and life, tied things
together for me and were a source of relief and a harbor of sanity. Cheers.
I cannot conclude without expressing my indebtedness to my family. As a youngster
I did not see the value of Saturday morning yard work and weekly chores; I do
now. Mom and Dad, the work ethic you instilled in me over the years made this degree
possible – academia is cake compared to the hot Arizona sun, cactus thorns and
the occasional rattlesnake! Finally, I am completely indebted to my wife Claudia
and daughter Inés for being there, good times and bad. Thank you for making our
house a home, a source of stability and love. Without you none of this is possible.
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
ABSTRACT
CHAPTER 1 Introduction
1.1 Linguistic intuition and syntactic theory
1.2 Research questions and overview
CHAPTER 2 The Syntactic Satiation Effect
2.1 Initial evidence for the Satiation effect
2.1.1 Snyder 2000
2.1.2 A memory bottleneck
2.2 Other findings for exposure-based rating change
2.2.1 An equalization strategy
2.3 Summary and predictions
2.4 Experiment 1
2.4.1 Methods
2.4.2 Results
2.5 Discussion
2.5.1 Methodological issues
2.5.2 Assessing the influence of response bias
2.6 Summary
CHAPTER 3 Repetition effects in sentence processing
3.1 Syntactic Priming
3.1.1 Language production
3.1.2 Language comprehension
3.2 Implicit learning
3.3 Is Syntactic Satiation an instance of Syntactic Priming?
3.4 Summary and predictions
3.5 Experiment 2
3.5.1 Methods
3.5.2 Results
3.6 Discussion
3.7 Experiment 3
3.7.1 Methods
3.7.2 Results
3.8 Discussion
3.9 General discussion
3.10 Summary
CHAPTER 4 Coordinating rating and reading time measures
4.1 Mitigating variability in rating tasks
4.2 Variability introduced by experimental methods
4.3 Assessing processing effects
4.4 Summary and predictions
4.5 Experiment 4
4.5.1 Methods
4.5.2 Results
4.6 Discussion
4.7 Experiment 5
4.7.1 Methods
4.7.2 Results
4.8 Discussion
4.9 General discussion
4.9.1 Task-specific effects
4.9.2 Processing-based effects
4.9.3 Residual issues
4.10 Summary
CHAPTER 5 Concluding Remarks
APPENDIX A Experiment materials
A.1 Experimental items
A.1.1 Experiment 1
A.1.2 Experiment 2 and 3
A.1.3 Experiment 4 and 5
A.2 Experimental instructions
References
LIST OF FIGURES
1.1 Object versus subject extraction in English and Spanish.
1.2 Contrasts for type of pre-verbal subject in Spanish.
1.3 Overall and repetition mean acceptability.
2.1 Equalization: the proposed effect of task balance on acceptability rating change.
2.2 Number of participants with response changes of type ‘Increase’ (more ‘no’ to ‘yes’ responses) or ‘Decrease’ (more ‘yes’ to ‘no’ responses) by sentence type in Experiment 1.
2.3 Overall acceptability for each sentence type in Experiment 1. (Grammatical distractors are labeled as ‘Filler’.)
2.4 Initial acceptability differences for Want for and That-trace sentences.
2.5 Rating stability: number of stable and unstable responses in Experiment 1.
3.1 Overall mean acceptability for Experiment 2.
3.2 Acceptability ratings by exposure set in Experiment 2.
3.3 Rating stability: number of stable and variable responses in Experiment 2.
3.4 Illustration of the Maze task (presented one line at a time).
3.5 Overall mean acceptability for Experiment 3.
3.6 Acceptability ratings by exposure set in Experiment 3.
4.1 Overall acceptability for bias conditions in Experiment 4.
4.2 Stability measures for bias conditions in Experiment 4.
5.1 Relative contrasts across experimental tasks.
5.2 Overall and change scores for acceptability judgments.
LIST OF TABLES
2.1 Examples of sentence types used in Snyder (2000).
2.2 Findings Summary for Satiation Effects.
2.3 Target ungrammatical items for Experiment 1.
2.4 Sign test calculation formula.
2.5 Percent acceptable by repetition for each sentence type ordered by overall acceptability in Experiment 1.
2.6 Stable versus unstable responses.
2.7 Maclay and Sleator (1960)
3.1 Examples of moderately grammatical sentence types in Luka and Barsalou (2005)
3.2 Target Items for experiment 2.
3.3 Mean ratings for First and Last Set for Experiment 1 and 2.
3.4 Rating stability: number of increased, decreased and stable responses in Experiment 3.
3.5 Mean ratings for First and Last Set for Experiment 1, 2 and 3.
4.1 Additional violation types in Experiment 4
4.2 Grammatical bias
4.3 Ungrammatical bias
4.4 Mean rating change across types by bias condition
4.5 Complete list of variance and deviance scores for the grammatical bias condition in Experiment 4. (Bold - strong effect, Underline - marginal effect)
4.6 Complete list of variance and deviance scores for the ungrammatical bias condition in Experiment 4. (Bold - strong effect, Underline - marginal effect)
4.7 Ungrammatical Sentence Types in Experiment 5
4.8 Residual Response Times for critical regions in Experiment 5.
ABSTRACT
This thesis explores the nature of linguistic introspection through the phenomenon
known in the literature as the Syntactic Satiation Effect, where the perceived unac-
ceptability of some syntactic structures is attenuated on repeated exposure. Recent
findings suggest that rating change in experimental settings may not reveal the un-
derlying grammatical status of syntactic objects by mitigating performance factors
related to memory limitations, as initially proposed, but rather arise as a response
bias conditioned by characteristics of some experimental designs, in effect introducing
task-based performance factors. Findings from rating and reading times suggest
that there is evidence supporting both accounts of rating change in experimental
designs and highlight areas of development for the Experimental Syntax program.
Exploring anecdotal reports, Snyder (2000) found that in as few as five exposures,
participants found some types of wh-extraction anomaly (‘weak Islands’)
significantly more acceptable at the end of the session compared to the beginning,
whereas others (‘strong Islands’) did not experience any rating improvement. Varied
success in replicating initial results casts doubts on the proposal that rating data,
experimentally elicited, can tease apart grammatical from performance sources of
unacceptability. Sprouse (2009) suggests an alternative: Satiation arises as an
artifact of a disproportionate number of ungrammatical to grammatical sentences in
the testing session. This approach provides an explanation for the apparent
mismatch in findings, but also highlights issues regarding the advances of experimental
syntax: do experimental methods provide better data or do aspects of some designs
systematically introduce extraneous influences themselves?
Evidence from three rating and two self-paced reading tasks suggests that
although robust evidence supporting the memory-based claim is not found, evidence
that Satiation is strictly task-based is not substantiated either; sentences that
satiate are similar across experiments. A novel observation is made that satiating
sentences are also more readily interpretable than non-satiating sentences, which
provides some explanation for the apparent mismatch between Satiation studies and also
points to another source of variability associated with experimental approaches to
linguistic intuition. In sum, evidence here underlines the composite nature of
introspection, points to areas of refinement for experimental techniques and advocates for
the adoption of cross-methodological procedures to enhance syntactic investigation.
Furthermore, a native speaker can only report the acceptability of an utterance,
not its grammaticality. Grammaticality is an internalized status which characterizes
a linguistic sequence generated by the grammar, the sum of the rules and principles
that form a speaker’s grammatical ‘competence’. Acceptability, however, is
only partially derived from grammaticality; a host of other non-syntactic
‘performance’ factors potentially enter into the introspective process (to a greater or
lesser degree, as in naturally occurring language use). Herein lies the key distinction made
between Competence and Performance (Chomsky, 1964, 1965).
Although the advantages to introspective data are apparent, a paradox arises:
no data from competence can be obtained without engaging performance to some
degree. This inextricable link between competence and performance, and the
composite nature of linguistic intuition, have been accepted, by and large, by linguists
as part and parcel of “doing syntax”. However, a growing research program in
Experimental Syntax suggests that while typical methods employed by linguists are
adequate to deal with clear-cut cases as in (1) and (2), this type of approach is less
equipped to evaluate more subtle cases of ungrammaticality as in (3).
(3) a. ??Which car did John ask how Mary fixed?
b. ?Who did John ask which car fixed?
Proponents of experimental approaches to syntactic inquiry suggest that by
adopting more standard “group testing” practices from psychology, variability
introduced by small sample sizes and unstandardized collection practices can be mitigated
(Schütze, 1996; Cowart, 1997; Tremblay, 2005; Sprouse, 2007c, inter alia).
Moreover, experimental syntax introduces the prospect of revealing contrasts between
structures that cannot be readily detected with standard techniques.
Evidence that experimental approaches can reveal contrasts for syntactic
structures that have gone undetected through informal techniques comes from both
typically accepted grammatical and ungrammatical structures. In the case of
grammatical structures, it is widely accepted that the That-trace filter, active in languages
such as English, is not operative in Spanish, as seen in the contrast in (4b) and (5b).
The cases cited here from Goodall (2004) and Montrul et al. (2008) are examples
of a growing trend to apply experimental techniques to syntactic investigation
(Nagata, 2003; Featherston, 2005; Sorace and Keller, 2005, inter alia). Most typically,
experimental techniques have focused on the mean sentence acceptability scores
for an entire session, illustrated in Figure 1.3a. Yet data collection methods
implemented in experimental designs allow researchers another type of view into the
introspective process – change in mean acceptability scores over the course of an
experiment, Figure 1.3b.
[Figure 1.3, panel (a): Contrasts between sentence types. Title: Overall Contrasts; x-axis: Sentence Types (Type A, Type B); y-axis: Mean Acceptability (1-5 scale).]
[Figure 1.3, panel (b): Contrasts per repetition for a single type. Title: Type A; x-axis: Repetitions (1 to 5); y-axis: Mean Acceptability (1-5 scale).]
Figure 1.3: Overall and repetition mean acceptability.
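The two views of rating data summarized in Figure 1.3 can be illustrated with a minimal sketch. The ratings below are invented for illustration only and are not data from this thesis; the function names `overall_means` and `repetition_means` are likewise hypothetical. The first function gives one mean per sentence type over the whole session (the view in panel a); the second gives one mean per repetition for a single type (the view in panel b).

```python
from collections import defaultdict
from statistics import mean

# Hypothetical ratings (NOT thesis data): (sentence type, repetition, rating).
# Ratings are assumed to lie on a 1-5 acceptability scale.
RATINGS = [
    ("Type A", 1, 2), ("Type A", 2, 2), ("Type A", 3, 3),
    ("Type A", 4, 3), ("Type A", 5, 4),
    ("Type B", 1, 4), ("Type B", 2, 4), ("Type B", 3, 5),
    ("Type B", 4, 4), ("Type B", 5, 4),
]


def overall_means(ratings):
    """Mean acceptability per sentence type across the whole session."""
    by_type = defaultdict(list)
    for sentence_type, _repetition, rating in ratings:
        by_type[sentence_type].append(rating)
    return {stype: mean(values) for stype, values in by_type.items()}


def repetition_means(ratings, sentence_type):
    """Mean acceptability per repetition for one sentence type."""
    by_repetition = defaultdict(list)
    for stype, repetition, rating in ratings:
        if stype == sentence_type:
            by_repetition[repetition].append(rating)
    return {rep: mean(values) for rep, values in sorted(by_repetition.items())}


if __name__ == "__main__":
    print(overall_means(RATINGS))
    print(repetition_means(RATINGS, "Type A"))
```

In this toy data set the overall contrast (Type B rated higher than Type A) hides the fact that Type A ratings rise across repetitions, which is the kind of change the per-repetition view is meant to expose.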
Snyder (2000) explored the possibility that rating change in an experimental
session could provide corroborating evidence for anecdotal accounts from
syntacticians that initial perception of unacceptability for some syntactic anomaly appears
to be attenuated with more exposure and familiarity. Known informally as
‘judgment fatigue’, ‘the linguist disease’, etc., experimental findings for seven distinct
wh-Islands show that participants’ responses corroborate informal evidence; some
types of anomaly are rated more favorably with increased exposure and repeated
evaluation, whereas others are not. Rating improvement or ‘Satiation’ for
classically ‘weak’ Islands (Subjacency violations) was taken as evidence that the surface
unacceptability of these structures is due to processing constraints related to
working memory limitations and not to syntactic constraint. The basic idea is that
memory resources are inherently better able to adapt to changing demands than
grammatical constraints, which are by hypothesis dichotomous and immutable.3
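The list of tables in this thesis names a sign test (Table 2.4), and Figure 2.2 counts participants whose responses increase (‘no’ to ‘yes’) or decrease (‘yes’ to ‘no’). The formula in Table 2.4 is not reproduced in this excerpt, so the sketch below assumes the standard exact two-sided sign test on those two counts, with participants showing no net change dropped beforehand. The function name `sign_test` and the example counts are hypothetical.

```python
from math import comb


def sign_test(increases, decreases):
    """Exact two-sided sign test p-value for counts of increases vs. decreases.

    Assumption: ties (participants with no net change) were removed before
    counting, and under the null hypothesis each remaining participant is
    equally likely to increase or decrease.
    """
    n = increases + decreases
    if n == 0:
        return 1.0  # no informative observations
    k = min(increases, decreases)
    # Probability of k or fewer cases in the smaller category under p = 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the one-sided tail and cap at 1.
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical counts: 8 participants increase, 2 decrease.
    print(sign_test(8, 2))
```

A small p-value with more increases than decreases would be the pattern described as Satiation; an even split gives a p-value of 1.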
Snyder’s basic proposal suggests that experimental approaches to syntax can
provide a novel tool for syntacticians to evaluate the underlying grammatical status of
syntactic structure – effectively creating a diagnostic test able to tease apart
grammatical from performance sources of unacceptability. However, subsequent studies
following up this line of inquiry report varied success in replicating Snyder’s results,
casting doubts on the initial account for the Satiation effect. Sprouse (2007c)
suggests that increased ratings in judgment tasks arise as artifacts of particular design
conditions employed in Satiation studies, namely a disproportionate number of
ungrammatical to grammatical sentences in the testing session. If it is on the right track,
this approach provides some explanation for the apparent mismatch in results for
Satiation effects, but also highlights larger issues regarding the advances of
experimental syntax in general: do experimental methods provide more revealing data
or do particular design aspects applied in some experimental designs systematically
introduce extraneous influences themselves?
1.2 Research questions and overview
This thesis has three main goals:
1. to provide a detailed investigation into the potential sources of rating change
in Satiation studies.
2. to evaluate the extent to which experimental approaches to acceptability
judgment tasks provide a clearer view into the structure of tacit linguistic
knowledge compared to traditional methods, i.e. better equipped to mitigate
extraneous performance factors associated with linguistic intuition.
3. to explore the nature of linguistic introspection and the relationship between
linguistic knowledge, linguistic behavior and linguistic theory.
Footnote 3: Another approach to the nature of grammar suggests that grammatical constraints are better
understood as gradient and violable, and potentially subject to change over time. See Bard et al.
(1996); Keller (1996); Sorace and Keller (2005).
The investigation begins with a closer look at evidence for the Syntactic
Satiation effect. Two main proposals are explored. First, Snyder’s initial hypothesis
that satiating sentence types are not ungrammatical, but rather difficult to process,
is reviewed. Mixed findings in replication studies following up this line of inquiry
feed a second proposal: Satiation effects are not linguistically-based but rather
essentially task-based and arise as artifacts under particular experimental designs
typically employed in Satiation studies. Evidence from Experiment 1 suggests that
although robust evidence supporting the sentence processing claims for Satiation
is not found, evidence that the effect is strictly non-linguistic is not substantiated
either; the types of structures that satiate are similar across experiments. A novel
approach is introduced based on the observation that satiating sentences tend to
be more readily ‘interpretable’ when compared to non-satiating sentences. This
account puts forward the hypothesis that participants confuse interpretability with
grammaticality in task-bias conditions, again pointing to experimental confounds,
not asymmetries in processing effects, as the primary source of Satiation effects.
Chapter 3 attempts to address the disparity between replication studies by
looking at the parallels between Satiation effects and a better-studied repetition
effect in language, Syntactic Priming. A review of the literature suggests
that, whereas priming for syntactic form is robustly found in production, and
to some extent in comprehension, it is unknown whether there are priming effects
for anomalous syntactic form: 1) no priming studies have investigated decidedly
anomalous syntactic structures and 2) rating improvement in Satiation studies
has only been found in ungrammatically-biased judgment tasks. Taken together,
these observations motivate a prediction: if Satiation is an instance of Syntactic Priming,
rating improvement should be a function of mere exposure, not of task type or task balance.
Evidence from Experiments 2 and 3 suggests that satiation does occur in balanced
task designs but may in fact be contingent on the judgment process itself, a mixed
finding for the notion that Syntactic Satiation is an instance of Syntactic
Priming.
Chapter 4 uses both rating and reading time measures to explore the extent to
which Satiation effects arise due to facilitated processing of syntactic structure or
rather as an artifact of the judgment process itself. Despite claims that repeated
exposure to sentence anomaly in a rating task facilitates processing, this has not
been corroborated in tasks more typically employed in psycholinguistics, such as
self-paced reading. Given the composite nature of the judgment process, and null
results for word-by-word reading exposure in Experiment 3, the prediction is that
Satiation effects stem from the judgment process itself and not from facilitated syntactic
processing. Results from a two-part experiment in which participants were exposed
to anomalous sentences in a rating task and subsequently participated in a reading
task suggest that Satiation effects do appear to have a processing component.
Although the findings here make an important contribution to the connection between
syntactic processing and linguistic intuition for unacceptable sentence types, there
is also evidence that participants deploy non-linguistic strategies that pose obstacles
to clear conclusions about the nature of Syntactic Satiation effects.
The results from the five experiments presented here suggest that Satiation
effects are replicable in rating tasks. However, it is less clear what underlies these
effects. Evidence gathered cannot exclude the possibility that memory
limitations, Syntactic Priming or test-taking strategies play some role in Satiation effects.
But, given the evidence, no one of these factors alone can account for the variety of
sentence types that undergo rating improvement while excluding those that do not.
Thus, results here suggest that adoption of group testing practices, typically
employed in experimental approaches to syntactic inquiry, can inadvertently introduce
systematic non-syntactic differences interpreted as syntactically relevant. These
findings point to potential shortcomings of the Experimental Syntax program as
currently practiced and highlight areas for future investigation.
introduced the term ‘Satiation’ into the literature to describe changes in judgments
with exposure to particular syntactic structures. The term makes reference to
Semantic Satiation, a phenomenon first documented by Severance and Washburn (1907)
in which repeated exposure to a word produces a sense of detachment, or
‘foreignness’, to the form of the word and its meaning.1 Syntactic Satiation, on the other
hand, appears from anecdotal evidence to produce quite the opposite effect: those
who experience the effect have the feeling that they understand the utterance
better.2 Consider a sentence such as (1). Linguists’ ability to clearly respond ‘Jill’ (or
some other name) to this question underscores this key difference between semantic
and syntactic satiation effects.
(1) Who does Mary believe the claim that John saw?
To some degree all syntacticians experience this sensation for some types of
syntactic structure, often forgoing intuitions for commonly assumed ungrammatical
sentence types and relying on memorized judgments as a source of stability. This
phenomenon has been codified into theory as the observational distinction between
‘weak’, marginally unacceptable (2) and ‘strong’, robustly unacceptable (3 and 4)
syntactic Island constraints (Ross, 1967). This distinction has been theoretically
subsumed by Subjacency violations (2) on the one hand and violations of the
Condition on Extraction Domains (CED) (3) or the Empty Category Principle (ECP)
(4) on the other (Lasnik, 1984).
(2) What did John ask who wrote? (Subjacency)
Footnote 1: Spencer, however, did not imply that the satiation process itself led to changes in introspective
judgments. In order to explain why linguists and naïve participants have contrasting judgments, he
suggests three separate processes bias those with linguistic training towards particular analyses of
syntactic structure. First, satiation applies on repeated exposure, detaching initial form-meaning
relations. Second, a reorganization process attempts to categorize the utterance. At this point in
the process, a lack of clear linguistic reorganizing principles is filled by theoretical expectations,
and assessment of syntactic form ‘contextually stabilizes’ the response in a particular way, based
on particular knowledge of proposed rules acquired in linguistic training.
Footnote 2: The confusion in terms is problematic. The term ‘training effects’ introduced for other reasons
may be less problematic.
Investigations into the comprehension of long-distance dependencies cite two
dynamics related to working memory constraints that lead to graded performance:
1) structural distance between the filler and gap and 2) the referential status of the
DPs that intervene between filler and gap (Fodor, 1978; Gibson, 1998; Kluender,
1998; Gibson and Warren, 2004; Kluender, 2005). Evidence for the influence of
structural distance comes from relative clauses: subject relatives (7) are read more
quickly and more accurately than object relatives (8) (King and Just, 1991).
(7) The reporter_i that t_i attacked the senator admitted the error.
(8) The reporter_i that the senator attacked t_i admitted the error.
The notion is that a filler (‘the reporter’, in this case) must be held in working
memory until a gap (the structural position where the filler is to be interpreted) can
be found. Working memory allocations decay over time, and the longer the distance
between filler and gap, the more difficult the relationship becomes to establish.
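The distance contrast between (7) and (8) can be made concrete with a small sketch. It counts the words standing between the co-indexed filler and its trace, which is only a crude linear stand-in for the structural distance discussed in the text. The `_i` / `t_i` token notation and the function name `filler_gap_distance` are assumptions made for this illustration.

```python
def filler_gap_distance(tokens, index="i"):
    """Count the words standing between the co-indexed filler and its trace.

    Assumption: the filler is written as `word_<index>` and the trace (gap)
    as `t_<index>`; linear word count is used as a rough proxy for distance.
    """
    trace = "t_" + index
    suffix = "_" + index
    # The filler is the first token carrying the index that is not the trace.
    filler_pos = next(
        pos for pos, tok in enumerate(tokens)
        if tok.endswith(suffix) and tok != trace
    )
    gap_pos = tokens.index(trace)
    return abs(gap_pos - filler_pos) - 1


# Examples (7) and (8) from the text, tokenized on whitespace.
SUBJECT_RELATIVE = "The reporter_i that t_i attacked the senator admitted the error".split()
OBJECT_RELATIVE = "The reporter_i that the senator attacked t_i admitted the error".split()

if __name__ == "__main__":
    print(filler_gap_distance(SUBJECT_RELATIVE))
    print(filler_gap_distance(OBJECT_RELATIVE))
```

On this measure the object relative keeps the filler waiting across more intervening words than the subject relative, in line with the direction of the difficulty reported by King and Just (1991).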
The referential status of DP fillers and DPs that intervene between the filler
and gap also has an effect on the sentence parser’s ability to parse dependency
relations. In general terms, the more referential (Discourse Linked) a DP is, the
more memory resources it requires, and the higher the activation level it achieves. For
interveners, the more referential, the more disruptive to a successful parse. Hence,
the referential intervener ‘the reviewer’ makes (9) more difficult to process than (10),
where a non-referential DP ‘someone’ appears.
(9) That’s the article_i that we need to find the reviewer who understands t_i.
(10) That’s the article_i that we need to find someone who understands t_i.
(Kluender, 1998)
On the other hand, more referential fillers (11) are more resilient to intervening
material, and thus are more comprehensible than non-referential fillers (12).
predicted outcome in these types of experimental setups.
2.3 Summary and predictions
The review of the primary literature provided in this chapter highlights two po-
tential sources for the Satiation Effect: a Memory Bottleneck and an Equalization
Strategy. Each approach makes distinct predictions about outcomes for a replication
of Snyder’s original experiment. First, the Memory Bottleneck hypothesis predicts
that syntactic structures that are hypothesized to be perceived as anomalous due to
demands on working memory in on-line comprehension (Subjacency effects) should
show increased ratings as a function of exposure. The apparent difficulty found by
the cited authors in consistently replicating these particular effects raises questions
as to whether memory limitations are at the root and/or the sole source of these
effects. On the other hand, the Equalization Strategy claims that the apparent
mismatch in results is evidence in support of the notion that the underlying root
of Satiation Effects is non-linguistic, and based on more general decision-making
heuristics. This approach predicts inherent variability in Satiation studies that employ
unbalanced experimental designs. However, the results of previous work have
not been as inconsistent as expected on a response bias account; findings are robust
for some satiating and non-satiating types across experiments. The following experiment
is a further attempt to replicate Snyder’s initial results in order to
assess the degree of variability associated with experimental approaches to rating
change.
2.4 Experiment 1
2.4.1 Methods
Participants
205 undergraduate students from six introductory linguistics courses at the Univer-
sity of Arizona gave informed consent to participate in a short in-class experiment
during regularly scheduled class time. Participants completed a brief questionnaire
soliciting information on age, sex, place of birth, native language, bilingualism, and
linguistic training.
13 participants were excluded as non-native speakers of English. The remaining
192 were born in a variety of states across the United States;6 however, almost
half were born in Arizona (82). 61 claimed proficiency in a language other
than English (37 of whom included Spanish). 13 were considered trained in linguistic
theory. Participants ranged in age from 18 to 33 years (mean 20.8).
Materials and procedure
Each participant rated a total of 50 interrogative sentences. These sentences were
the same sentences and sentence types used in Snyder (2000).7 These included
35 ungrammatical target sentences and 15 grammatical distractors. The target
ungrammatical items consisted of seven typical wh-extraction violations seen in
Table 2.3.
Violation type    Example
Want For          Who does John want for Mary to meet?
Whether           Who does John wonder whether Mary likes?
That-trace        Who does Mary think that likes John?
Subject Island    What does John know that a bottle of fell on the floor?
Complex NP        Who does Mary believe the claim that John likes?
Adjunct Island    Who did John talk with Mary after seeing?
Left Branch       How many did John buy books?
Table 2.3: Target ungrammatical items for Experiment 1.
The experimental items were organized into five blocks of ten items. Each violation
appeared once in each block along with three grammatical distractors. Two
lists were created in which block ordering was reversed for the first and last two blocks, to
limit the contribution of any particular sentence to the overall rating pattern.
6States represented include: AZ, CA, CN, CO, DL, GA, IL, IW, KS, LA, MA, MI, MN, MO,
MS, NC, NJ, NM, NV, NY, OH, OK, PA, SC, SD, TX, VA, WA
7A special thank you to William Snyder for sharing these materials.
In addition, items within each block were randomized for each list to avoid within
block ordering effects.
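The list construction just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the function names are hypothetical, items are represented by placeholder labels rather than the real sentences, and the reading of "reversed for the first and last two blocks" as a swap of the first two blocks with the last two (middle block fixed) is an assumption.

```python
import random

VIOLATIONS = ["Want for", "Whether", "That-trace", "Subject Island",
              "Complex NP", "Adjunct Island", "Left Branch"]


def build_blocks(n_blocks=5, n_fillers=3):
    """Each block holds one token of every violation type plus three fillers."""
    blocks = []
    for b in range(n_blocks):
        targets = [(v, b) for v in VIOLATIONS]
        fillers = [("Filler", b * n_fillers + k) for k in range(n_fillers)]
        blocks.append(targets + fillers)
    return blocks


def build_list(blocks, order, rng):
    """Arrange the blocks in the given order and shuffle items within each block."""
    presentation = []
    for idx in order:
        block = list(blocks[idx])
        rng.shuffle(block)
        presentation.append(block)
    return presentation


rng = random.Random(0)  # fixed seed only so that the sketch is reproducible
blocks = build_blocks()
list_1 = build_list(blocks, [0, 1, 2, 3, 4], rng)
# Assumed interpretation: the first two and last two blocks trade places.
list_2 = build_list(blocks, [3, 4, 2, 0, 1], rng)
```

Under these assumptions each list contains 35 target tokens and 15 fillers, with every violation type appearing exactly once per block.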
Participants were presented one of the two lists using a projector and the presen-
tation software Keynote. Each sentence was projected individually and displayed
for 8 seconds before self-advancing to the next item.8 Each sentence appeared on
one line, centered, in Courier New font.9
Participants were instructed to read each sentence silently and provide a “Yes”
rating for acceptable sentences and “No” for unacceptable sentences. Two examples
were provided in the instructions discouraging participants from using prescriptive
grammatical forms as a basis for acceptability. In addition, an effort was made to
highlight that the session was not designed as a memory test and participants were
encouraged to make independent assessments of the items presented.
Participants recorded their responses within a table found on the opposite side
of the questionnaire containing two columns corresponding to Yes and No and 50
rows for each sentence displayed. The session including consent, questionnaire and
rating lasted from 20 to 25 minutes.
The questionnaire and rating responses were entered by hand into a database
for organization and preparation for data analysis.
2.4.2 Results
Following the analysis provided in Snyder (2000), the first two (Set 1) and last two
(Set 2) response scores were grouped for each of the seven violation types in each
list. These sets served as the basis for relative rating change over the experimental
session and were used as the factor Set(First,Last) in the following analyses.
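The grouping into Set 1 and Set 2 can be illustrated with a short sketch. The coding of responses as 1 for "Yes" and 0 for "No", the function names, and the example response vector are assumptions made here for illustration only.

```python
def set_scores(responses):
    """Sum the first two and the last two ratings for one violation type.

    `responses` holds the five 0/1 ratings (0 = No, 1 = Yes) a participant
    gave to one violation type, in presentation order.
    """
    first = sum(responses[:2])
    last = sum(responses[-2:])
    return first, last


def classify_change(first, last):
    """Label the direction of rating change between the two sets."""
    if last > first:
        return "Increase"
    if last < first:
        return "Decrease"
    return "Stable"


example = [0, 0, 1, 1, 1]  # hypothetical participant, one violation type
first, last = set_scores(example)
label = classify_change(first, last)
print(first, last, label)
```

The middle (third) response does not enter either set, so only the two ends of the session contribute to the Set factor.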
8The timing for slide display was tested on an independent group and found to be adequate for
the task.
9In contrast to Snyder’s original design, context sentences were not provided. Evidence from
Sprouse (2007a) suggests that contextualizing interrogative sentences does not condition acceptability
judgments in a rating task. For this reason, and for presentational consistency
with subsequent experiments, context sentences were not given in the experiments that appear in
this thesis.
[Bar chart, “All Participants”. x-axis: sentence type (Want for, Whether Island, Subject Island, That-trace, Complex NP, Adjunct Island, Left branch, Filler); y-axis: Number of Changes (0 to 60); bars: Increase, Decrease.]
Figure 2.2: Number of participants with response changes of type ‘Increase’ (more
‘no’ to ‘yes’ responses) or ‘Decrease’ (more ‘yes’ to ‘no’ responses) by sentence type
in Experiment 1.
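Counts of the kind shown in Figure 2.2 are the input to a Sign Test. As a minimal sketch, an exact two-sided version can be computed from the binomial distribution; the function name `sign_test` and the example counts below are hypothetical and are not the values plotted in the figure.

```python
from math import comb


def sign_test(n_increase, n_decrease):
    """Exact two-sided Sign Test p-value.

    Participants whose ratings did not change are ignored, so only the
    increase and decrease counts are passed in.
    """
    n = n_increase + n_decrease
    if n == 0:
        return 1.0
    k = min(n_increase, n_decrease)
    # Probability of k or fewer of the rarer outcome under p = 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


p_example = sign_test(30, 10)  # hypothetical counts, for illustration only
print(p_example)
```

A small p-value indicates that changes in one direction outnumber changes in the other by more than would be expected if increases and decreases were equally likely.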
Results matched almost entirely with the Sign Test analysis.11
Here too ANOVAs and pairwise comparisons were conducted. A 2 x 7 (x 2)
ANOVA was conducted with Set(First, Last) and Type(Want for, Whether Island, Subject Island,
That-trace, Complex NP, Adjunct Island, Left branch) as within-subjects
factors. List(1,2) was also included as a between-subjects blocking
factor. The main effects for Set [F (1, 382) = 11.8, p < 0.001] and Type
[F (7, 1337) = 1119.8, p < 0.001] were significant, as was the interaction between
them [F (7, 1337) = 5.61, p < 0.001]. Simple effects for Set for each level of Type
were conducted. Again, Want for [F (1, 382) = 10.1, p < 0.01], Whether Island
[F (1, 382) = 14.2, p < 0.001] and Subject Island [F (1, 382) = 4.8, p < 0.05] violations
reached significance. No other sentence types reached statistical significance.
However, parametric measures such as ANOVA are not ideal for categorical
11Paired t-tests showed significant effects for Whether Islands and Complex NP violations and
marginal effects for Subject Islands (t(21) = 1.79, p = 0.088). ANOVA tests resulted in a main effect
of sentence type, and a pairwise contrast for Whether Islands and each of the four non-satiating
structures from the Sign Test.
Type              First set    Last set
Filler            96.6         96.4
Want for          75.4         83.4
Complex NP        38.3         34.9
That-trace        28.3         33.2
Whether Island    25.6         37.6
Subject Island    22.9         29.3
Adjunct Island    12.2         11.7
Left Branch       2.9          2.4
Table 2.5: Percent acceptable by repetition for each sentence type ordered by overall
acceptability in Experiment 1.
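The rating change implied by Table 2.5 can be read off by subtracting the first-set percentage from the last-set percentage for each type. The sketch below does only that arithmetic on the values in the table; the variable names are introduced here for illustration, and the differences are descriptive, not a substitute for the statistical tests reported in the text.

```python
# Percent acceptable (first set, last set), copied from Table 2.5.
table_2_5 = {
    "Filler": (96.6, 96.4),
    "Want for": (75.4, 83.4),
    "Complex NP": (38.3, 34.9),
    "That-trace": (28.3, 33.2),
    "Whether Island": (25.6, 37.6),
    "Subject Island": (22.9, 29.3),
    "Adjunct Island": (12.2, 11.7),
    "Left Branch": (2.9, 2.4),
}

# Change in percentage points, rounded to the table's one decimal place.
change = {t: round(last - first, 1) for t, (first, last) in table_2_5.items()}

# Sentence types ordered from largest increase to largest decrease.
ranked = sorted(change, key=change.get, reverse=True)
for sentence_type in ranked:
    print(f"{sentence_type:15s} {change[sentence_type]:+.1f}")
```

The three largest increases fall on Whether Island, Want for and Subject Island, the same three types for which the simple effects of Set reached significance.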
Bottleneck hypothesis that Subjacency constraints are constraints on processing and
not syntactic constraints.
While this evidence shows some degree of support for the memory-based account,
the more variable effects shown across studies by other anomalous types (Want for,
That-trace and Complex NPs) weaken the Memory Bottleneck account of Satiation
effects. The current experiment found effects for Want for constructions. This
corresponds with evidence from Hiramatsu (2000), who also found effects for Want
for, in addition to That-trace effects (which were not found here). As noted in her
discussion, Comp-trace constructions (Want for and That-trace) have been noted
in the literature to be acceptable in some varieties of English (Lasnik, 1984; Sobin,
1987, 2002). This approach aims to explain the disparity with Snyder’s
results, which showed no Satiation effects for either Want for or That-trace constructions, as a
sampling issue (differences in the diversity of the populations tested in Hiramatsu’s
study (U Connecticut) versus Snyder’s (MIT)).
However, two issues complicate the notion that dialect differences explain contrasting
Comp-trace findings. First, Want for constructions employed in Snyder, in
(15a), and subsequently adopted in Hiramatsu and the current experiment, are not
Comp-trace effects such as in (15b) and (15b) –the true Islands violations.
but meaningful” sentences, in Table 2.7, as participants instructed to rate ‘mean-
ingfulness’.
Yesterday I the child a dog gave
The with feet aching man came yesterday home
Get me from the kitchen a big spoon
You can him not understand
Almost every Saturday she her house cleans
To me was interesting the movie
Table 2.7: Maclay and Sleator (1960)
The empirical problem with the Equalization Strategy in its current form is that it
predicts more variability among Satiation studies than is attested. In this light, one
way to linguistically ground this hypothesis is to connect it to this apparent tendency
for participants to extract meaning from anomalous form. If interpretability is the
linguistic aspect that guides equalizing responses, then common effects found across
the experiments reported may have an explanation. Augmenting the Equalization
Strategy with a meaning-based criterion change, implemented consciously by participants
in response to the apparent mismatch with expected rating distribution
outcomes, makes inroads in accounting for the attested mismatch and variability, as
reasoned strategies are bound to vary between individuals and between population
samples.
Another aspect of these data that may provide some insight comes from the
fact that some types of anomaly in the original seven wh-extraction violations are
deviant in much more obvious ways than others. In essence, interpretability may
in fact be a function of the edit distance between an anomalous and licit syntactic
form. For example, Want for, That-trace and Whether Islands can be corrected by
replacement or omission of one word. In contrast, Adjunct Islands and Left Branch
Condition sentences are not easily correctable in the same sense. Along these lines,
Hiramatsu noted that some participants had crossed out the “that” in That-trace
constructions on their response sheets. This indicates that it was clear enough to
some what was meant by the anomalous phrasal type that they felt compelled to
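The notion of edit distance invoked in this passage can be made concrete with a word-level Levenshtein distance. The sketch below is illustrative only: the licit "repaired" forms paired with each anomalous sentence are assumptions chosen here, not repairs proposed in the text, and other repairs would yield other distances.

```python
def word_edit_distance(source, target):
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    a = source.rstrip("?").split()
    b = target.rstrip("?").split()
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(prev[j] + 1,          # delete word_a
                            curr[j - 1] + 1,      # insert word_b
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


# Anomalous items from Table 2.3 paired with assumed licit counterparts.
pairs = {
    "That-trace": ("Who does Mary think that likes John?",
                   "Who does Mary think likes John?"),
    "Want for": ("Who does John want for Mary to meet?",
                 "Who does John want Mary to meet?"),
    "Left Branch": ("How many did John buy books?",
                    "How many books did John buy?"),
}
distances = {name: word_edit_distance(bad, good) for name, (bad, good) in pairs.items()}
print(distances)
```

With these assumed repairs, the That-trace and Want for items are one word-level edit away from a licit form, while the Left Branch item requires a displacement that this measure counts as two edits.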
2.6 Summary
To conclude, we began this chapter with a review of the evidence for the Satiation
Effect which mounted a case against the replicability of the effect. Against these
claims, findings here suggest that the Satiation Effect is replicable – albeit some-
what inconsistent. The quite consistent findings that separate Whether Islands and
Subject Islands on the one hand, from Adjunct Island and Left Branch Condition
structures on the other argue in favor of distinct sources for the surface acceptability
of these constructions. What does not corroborate the initial conclusions made by
Snyder that rating change reveals a class of syntactically grammatical structures
too difficult for working memory to process, is the fact that other sentence types,
such as Want for constructions, also show effects that do not demonstrate the same
hypothesized processing burdens.
The degree of overlap in results between the various investigations argues against
an account that is based on test-taking strategies alone. A consideration introduced
here points out a potential qualitative difference between satiating and non-satiating
sentence types. Satiating types are typically more ‘interpretable’. Including inter-
pretability as a factor rekindles the possibility that the explicit, reasoned component
of judgment tasks provides participants the opportunity to change their response
criterion. Criterion changes over the course of the experiment in this way may de-
velop under particular pressures to equalize the distribution of rating responses to
50/50 in unbalanced designs in a non-spurious fashion.
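The pressure toward a 50/50 distribution can be illustrated with simple arithmetic. The sketch below asks how many "Yes" responses a participant would have to give to ungrammatical items to end up with an even split, under the simplifying assumption (made here, not in the text) that every grammatical item is accepted. The function name is hypothetical; the item counts are those of Experiment 1 (35 targets, 15 distractors) and of a balanced design with 40 items of each kind.

```python
def yes_needed_for_balance(n_ungrammatical, n_grammatical, target_yes=0.5):
    """Number and proportion of ungrammatical items that must be rated "Yes"
    to reach the target share of "Yes" responses overall.

    Assumes, for illustration, that all grammatical items receive "Yes".
    """
    total = n_ungrammatical + n_grammatical
    yes_on_targets = target_yes * total - n_grammatical
    return yes_on_targets, yes_on_targets / n_ungrammatical


unbalanced = yes_needed_for_balance(35, 15)  # Experiment 1 item counts
balanced = yes_needed_for_balance(40, 40)    # a design with equal counts
print(unbalanced, balanced)
```

Under this assumption a participant in the unbalanced design would need to accept 10 of the 35 ungrammatical items to equalize responses, whereas a balanced design exerts no such pressure.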
The key question that remains is: what is the influence of non-linguistic
strategies versus the contribution of linguistic processing on increased rating scores?
Chapter 3 aims to address these particular issues by taking a closer look at the
parallels between Satiation and a well-documented sentence processing phenomenon,
Syntactic Priming.
CHAPTER 3
Repetition effects in sentence processing
This chapter aims to begin to tease apart the potential sentence processing and
task-based sources of the Satiation effect. As a first step, the parallels between Syntactic
Satiation and the better documented phenomenon of Syntactic Priming are
explored. Evidence for automatic, implicit responses to exposure to syntactic form
has been demonstrated in a large and growing research program. Robust priming
effects are found in language production, within and across languages. Yet there
are fewer studies on priming in language comprehension, and those studies suggest
there are some complications in drawing a direct analogy from effects in production to
effects in language comprehension.
One key point to consider for the present investigation is the observation that
there is a general lack of evidence for priming of anomalous syntactic structure,
which, in turn, raises the question of the extent to which priming is contingent on licit
underlying forms. Another important facet is the degree to which ‘mere-exposure’
to syntactic form underlies the observed increase in rating scores or whether these
differential effects shown in the Satiation literature are particular to decision-based
exposure. Evidence from two experiments presented here argues for a non-judgment
based measure to gauge the influence of sentence processing changes as a function
of exposure in Satiation studies.
3.1 Syntactic Priming
People tend to repeat structure in spoken language, as in (1). Speakers may choose
to repeat some aspect of language voluntarily for stylistic reasons (e.g. emphasis), or
the repetition may be unintentional.
(1) a. Individual speaker
i. Once you’re in it, you can’t get out it.
b. Between speakers
i. I most certainly am not
ii. You most certainly am are
3.1.1 Language production
Bock (1986) first attempted to explore this phenomenon in an experimental set-
ting looking at syntactic structure in language production, in particular the Voice
alternation between active (2a) and passive (2b).
(2) Voice Alternation
a. The ball hit the boy. (Active)
b. The boy was hit by the ball. (Passive)
In a picture-description task in which a participant listens to a sentence prompt,
repeats the sentence and then describes an unrelated event depicted in a picture,
Bock found that participants repeating a passive sentence were more likely to
describe the picture using a passive sentence as well.
Supporting evidence followed for other propositionally equivalent structures,
most notably for Dative alternations (3).
(3) Dative Alternation
a. The girl handed an apple to the teacher. (Prepositional Object)
b. The girl handed the teacher an apple. (Double Object)
These and subsequent experiments point to a truly syntactic component to priming
effects, demonstrating that priming relies neither on verbal subcategory properties
(theta-roles) (Bock and Loebell, 1990) nor on the repetition of function words
(Bock, 1989). Furthermore, evidence also suggests that facilitatory gains provided
by recent exposure can withstand up to ten intervening unrelated syntactic objects
(Bock and Griffin, 2000).
Confirming both the observational and experimental evidence, a series of corpus
analyses also show that in naturalistic contexts speakers tend to repeat structures
that they have used or comprehended in the recent discourse context (Bresnan, 2007;
Gries, 2005). These studies corroborate another key aspect of the experimental evidence,
namely that priming effects appear to be ‘long-lasting’, persisting despite
intervening syntactic material. Furthermore, corpus analyses also provide an opportunity
to gauge the extent to which the distributional frequencies of particular
syntactic forms influence syntactic choice. Along these lines, Jaeger and Snider
(2007) report that, in addition to recent syntactic exposure, speakers are sensitive to
the relative frequency of the syntactic structures they are exposed to. Their data show
that low frequency structures appear to prime more than high frequency structures
(Inverse Frequency Effect).1
3.1.2 Language comprehension
Evidence for syntactic priming has also been found in sentence comprehension. Noppeney
and Price (2004) looked at locally ambiguous early/late closure structures for
which there are (temporarily) two distinct parses. In general, speakers tend to parse
the NP constituent (e.g. ‘the stage’) as the object of the subordinate clause verb (late closure
(4)) and disprefer structures in which the relevant NP is taken as the subject of the
following clause (early closure (5)).
(4) [Before the director had left the stage] the play began.
(5) [After the headmaster had left] the school deteriorated rapidly.
1Frequency effects in language processing refer to the observation that frequency of occurrence
of a linguistic object (lexical items (Scarborough et al., 1977) and sentences (Juliano and Tanen-
haus, 1994)) predicts ease of processing. The Inverse Frequency Effect refers to findings that low
frequency linguistic objects show more sensitivity (i.e. relative change) to recent exposure than
high frequency structures (Forster and Davis, 1984).
This study found that reading times were faster and neural activity (fMRI) was attenuated
for dispreferred parses when prime sentences were structurally similar.
Arai et al. (2007) investigated the Dative alternation by recording eye-gaze times
in a visual-world paradigm. Their findings show that participants tend to direct their
gaze towards the picture of the recipient after exposure to DO constructions (6),
and towards the theme after PO constructions (7).
(6) The assassin will send the dictator the parcel. (DO)
(7) The assassin will send the parcel to the dictator. (PO)
However, in a follow-up experiment Arai et al. (2007) found that these effects
appear to be contingent on the repetition of the matrix verb: switching ‘send’ for
‘give’ in priming pairs eliminated the effect. A cross-experiment comparison suggests
that verb repetition also plays a role in production, enhancing effects. Taken
together, the evidence points to differences in the robustness of detectable priming in
production versus comprehension.
There has also been evidence that exposure to novel syntactic forms can condition
priming. Kaschak and Glenberg (2004) found that participants who read
text excerpts containing the ‘needs’ construction (8) showed faster reading times
and higher accuracy on comprehension questions in a subsequent online experiment
than participants who had not previously been exposed to the novel construction
but rather to the standard construction conveying the same propositional meaning (9).
(8) Needs construction 2
a. The meal needs cooked.
b. The grass needs cut.
(9) Standard construction
a. The meal needs to be cooked.
b. The grass needs to be cut.
These findings were taken as evidence that adult speakers of a language are
capable of learning to comprehend new constructions in their native language.
of exposure on familiarity – indirectly measured by change in rating scores. This
approach did not aim to assess the possibility of obtaining increased rating scores
for strictly ungrammatical stimuli. Rather the semi-acceptable status of the target
items was necessary only in terms of the logistics of providing room for rating
improvement. In fact, the investigators did not include ungrammatical distractor
items in the exposure task in a deliberate attempt to avoid drawing attention to
the grammatical status of the target items. The goals of Satiation work are quite
different: the grammatical status of the test items is the object of inquiry.
Sprouse (2007b) suggests that this difference is of key importance in regard
to claims about the source of the Satiation effect and its (non)linguistic basis. He
argues that Luka and Barsalou’s items are not ungrammatical in the technical sense,
as is the case in Satiation studies, and that only grammatical structure can show
syntactic priming effects. This position emphasizes the distinction between types
of anomaly and asymmetrical effects in syntactic priming – assuming that syntactic
priming, and facilitated processing can only stem from licit syntactic structure.
The prediction, again, is that Satiation effects attested in the literature stem from
unbalanced experimental designs and are therefore inherently non-linguistic.
Results from Experiment 1 challenge this stance on the grounds that the effect should
show more instability than is actually attested. However, the possibility that uneven
balance in rating tasks contributes to Satiation effects cannot be excluded without
addressing this issue directly.
Another key prediction that stems from literature in Syntactic Priming concerns
the nature of the exposure condition. One defining characteristic of studies on
the Satiation effect is that they have exclusively used rating tasks as the exposure
task and the dependent measure. Contrastively, priming literature has primarily
employed auditory and/or reading tasks as exposure sessions. In fact, Luka and
Barsalou purposely avoided exposure to clearly ungrammatical items in the exposure
task in order to avoid systematic engagement of structural evaluation strategies
on the part of participants. This raises a question hitherto unexplored in the
Satiation literature: are repetition effects found in exposure contexts that do not
Table 3.2: Target Items for experiment 2.
Violation type    Example
That-trace        Who does Mary think that likes John?
Subject Island    What does John know that a bottle of fell on the floor?
Complex NP        Who does Mary believe the claim that John likes?
Adjunct Island    Who did John talk with Mary after seeing?
Left Branch       How many did John buy books?
instances of each of seven violation types to correspond to five repetition frames.
The materials here included eight instances, reflecting an increase in the number
of exposures by three. Furthermore, the original materials included a number of
potentially salient properties that were modified for the current set of items to
follow standard practices in psycholinguistics; in particular the use of uncommon
proper names and the uneven distribution of Wh-words employed.
Proper names in the previous item list were replaced with more common names
for the participant age range. These names were harvested from the 50 most popular
names registered with Social Security from 1985. Previous materials included an
uneven scattering of What, Who, How many, Where and Why (23 What, 14 Who,
7 How many, 4 Where, 1 Why). Here That-trace, Subject Island, Complex Noun
Phrases and Adjunct Islands were balanced between What and Who question words.
Left Branch violations always employed How many. Grammatical controls were also
balanced for question word in an equal proportion to target items (32 What, 32 Who,
16 How many). An effort was also made to distribute the main clause verbs across
sentence types.
Items were grouped into eight ten-item blocks. Each block of ten items included
an instance of each of the five violations together with five grammatical sentences.
Eight presentational lists were created by distributing the repetition frame in which
individual sentences appeared. Participants were randomly assigned to these presentation
lists. Repetition blocks were randomized by computer, with a different order
for each subject in order to avoid within-block ordering effects.
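One common way to distribute items over repetition frames across lists is a Latin-square rotation, sketched below. The text does not say that this particular scheme was used, so the rotation rule and the function names are assumptions for illustration; items are represented only by their index within a violation type.

```python
N_BLOCKS = 8  # eight repetition frames (blocks) per list


def block_for(item_index, list_index, n_blocks=N_BLOCKS):
    """Block in which a given item appears on a given list (assumed rotation)."""
    return (item_index + list_index) % n_blocks


def build_lists(n_lists=8, n_blocks=N_BLOCKS):
    """For each list, the block assigned to each of the eight items of a type."""
    return [[block_for(item, lst, n_blocks) for item in range(n_blocks)]
            for lst in range(n_lists)]


lists = build_lists()
for list_index, assignment in enumerate(lists, start=1):
    print(f"List {list_index}: item -> block {assignment}")
```

With this rotation every item occupies each repetition frame exactly once across the eight lists, so no single sentence is tied to early or late exposure.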
Participants were asked to give acceptability ratings for the 85 interrogative sen-
tences. Each participant was assigned to one of the eight presentation lists which
was presented on a PC. The experimental software DMDX (Forster and
Forster, 2003) was used to coordinate the presentation of the items and to record
participant responses. Each sentence was displayed individually, on one line, centered
on the monitor and in Courier New font. A default timeout was set at 30
seconds in order to give the respondent ample time to respond.7
The instructions for the experiment were contained within the experimental session.
The main components of these instructions mirrored those used in the previous
experiment.8 However, the example sentences were removed from the instructions
themselves and replaced by practice items appended to the beginning of the main experimental
session.9 Participants were instructed to read each sentence and press the
Yes button for acceptable sentences and the No button for unacceptable sentences.
The session including consent, questionnaire and rating lasted from 15 to 25 minutes.
The questionnaire responses were entered by hand and response data by PHP
script into a database for organization and preparation for data analysis. Both
written and electronic data were connected by a participant code assigned at the
beginning of the session and recorded on the questionnaire form and the experimen-
tal output file as an ID.
3.5.2 Results
Participants’ overall ratings for each sentence type can be found in Figure 3.1.
The data were analyzed by both parametric (analysis of variance) and non-
parametric (G2 logistic regression analysis of deviance) statistical procedures.
Relevant factors for both analyses were exposure Set(First,Last)10 and sentence
Type(That-trace, Subject Island, Complex NP, Adjunct Island, Left Branch Con-
7In fact, most participants never reached the timeout threshold and were otherwise unaware
that there was a time limit imposed by the software
8These instructions appear in the appendix
9These practice items contained two violation types found in experiment 1 but not part of
experiment 2 in order to avoid an extra repetition exposure for these particular sentence types
10First set included repetitions 1 and 2, and the last set repetitions 7 and 8.
[Bar chart, “Overall Means”. x-axis: sentence type (Filler, That-trace, Subject Island, Adjunct Island, Complex NP, Left Branch); y-axis: Percent Acceptable (0 to 100).]
Figure 3.1: Overall mean acceptability for Experiment 2.
dition)11. The List(1-8) factor was used as a blocking factor in factorial tests and is
not reported.
An overall 2 x 5 ANOVA was conducted revealing main effects for Set [F (1, 32) =
4.8, p < 0.05] and Type [F (4, 80) = 117.9, p < 0.001] with no interaction between the
two. Planned comparison tests for Set for each sentence type revealed a difference
between first and last set exposures for Subject Island [F (1, 32) = 4.9, p < 0.05]
violations. Results from logistic regression measures confirm the parametric results
(Subject Island violations (G2 = 4.0, p < 0.05)) and no other effects for the other
four violation types were found.
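The G2 statistic reported here is a likelihood-ratio (deviance) measure. As a minimal sketch of how such a value relates to its p-value, the code below computes G2 for a 2 x 2 table of counts and converts a G2 value with one degree of freedom into a p-value. It is an illustration under assumptions: the function names are hypothetical, the 2 x 2 form ignores the repeated-measures structure of the actual data, and the procedure used in the original analysis may have differed.

```python
from math import erfc, log, sqrt


def g_squared(table):
    """Likelihood-ratio statistic G2 = 2 * sum(O * ln(O / E)) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:
                expected = row_totals[i] * col_totals[j] / n
                total += observed * log(observed / expected)
    return 2 * total


def p_value_df1(g2):
    """Upper-tail chi-square probability for one degree of freedom."""
    return erfc(sqrt(g2 / 2))


# The value reported for Subject Island violations in the text.
p_reported = p_value_df1(4.0)
print(round(p_reported, 4))
```

Evaluated at G2 = 4.0 with one degree of freedom, the p-value falls just under 0.05, in line with the threshold reported for Subject Island violations.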
Rating stability was calculated by counting the number of participants that split
yes/no ratings in both the first and last repetitions versus the number of participants
that responded yes or no throughout the first and last set repetitions. These results
are shown in Figure 3.3. The differences between these counts were submitted
to Sign Test for each of the five levels of Type. Results found that Left Branch
Condition (p = 0.01) and Complex NP (p = 0.02) sentence types were more stable
11Note only target ungrammatical types were considered in the statistical analysis
NPs. In the unbalanced context in Experiment 1, Complex NPs were found more
acceptable than the other four types reported here but in Experiment 2 Complex
NPs were judged less acceptable – on par with Left Branch Condition violations.
Stability measures provide another means of detecting the effect of uneven design
contexts. Results from Experiment 1 found Complex NPs the most unstable over
the course of the experiment, whereas in the current experiment Complex NPs were
much less variable. Thus, task balance may in fact play a role by contributing to a
generalized instability in rating responses in the experimental session.
In short, the evidence from Experiment 2 suggests that Satiation effects are
observable in balanced experimental designs, lending credence to the idea that rep-
etition effects found in Satiation studies are related to sentence processing (priming
or a memory bottleneck), and not task-bias effects. Contrasting evidence from
unbalanced and balanced designs, however, does provide some support for the hypothesis
that biased contexts do influence rating responses to some degree, just
not the direct and clear influence on increased rating responses put forward by
the Equalization Strategy. The next step, explored in Experiment 3, is to address
to what extent Satiation effects can be separated from the judgment process itself.
If rating improvement is an index of syntactic priming then effects should also be
obtained through reading-based exposure.
3.7 Experiment 3
3.7.1 Methods
Participants
Twenty-eight undergraduate students from the University of Arizona participated in the
experiment. They were enrolled in an
introductory psychology course and received course credit for their participation.
Participants completed a language background questionnaire aimed at assessing
their proficiency in English and exposure to other languages. Based on the ques-
tionnaire data, 10 participants were excluded for proficiency in a language other
The ...
gone dog
chased sink
our hosed
into. cat.
Figure 3.4: Illustration of the Maze task (presented one line at a time).
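The frame-by-frame presentation illustrated in Figure 3.4 can be sketched in code. In the Maze task each frame pairs the word that continues the sentence with a foil, and the participant chooses between them. The sketch below is a hypothetical reconstruction: the function name, the random left/right placement and the tuple layout are assumptions, and the sentence and foils are read off the figure.

```python
import random


def maze_frames(sentence, foils, rng):
    """Build (left, right, correct_side) frames; correct_side is 0 (left) or 1 (right).

    The first word has no foil and is paired with "...", as in the figure.
    """
    words = sentence.split()
    frames = [(words[0], "...", 0)]
    for word, foil in zip(words[1:], foils):
        if rng.random() < 0.5:
            frames.append((word, foil, 0))
        else:
            frames.append((foil, word, 1))
    return frames


rng = random.Random(1)  # fixed seed only so that the sketch is reproducible
frames = maze_frames("The dog chased our cat.", ["gone", "sink", "hosed", "into."], rng)
for left, right, correct_side in frames:
    print(f"{left:8s} {right:8s} correct: {'left' if correct_side == 0 else 'right'}")
```

Reading off the correct member of each frame in order reconstructs the sentence, which is what a participant who makes no errors effectively does.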
blocks were used in the final analysis.
3.7.2 Results
Responses were evaluated by the same factorial groupings and statistical measures
as in Experiment 2.15 Means for the relative acceptability for each sentence type
can be seen in Figure 3.5.
An overall ANOVA considering the factors Set and Type was conducted. Results
reveal a main effect for Type [F (4, 170) = 43.7, p < 0.001] but not for Set. One-way
ANOVAs revealed no differences in the mean acceptability of the sentence types
tested. Similarly, G2 measures showed no differences. In order to assess the
effect of exposure on rating responses, planned comparisons were run for Set by each
sentence type. No significant effects were detected.
Given that the first and last sets each contained only one rating score, rating stability
was not calculated. The number of participants for increased, decreased and stable
15Note that the factor Set with the levels ‘First’ and ‘Last’ are synonymous with the first and
eighth repetition exposures given that the second through seventh exposures took place in the
reading task and no rating score was taken.
Page 64
hidden
64
That-t Subject Island Complex NP Adjunct Island Left Branch
Increase 6 4 1 3 1
Decrease 5 4 3 6 2
Stable 7 10 14 9 15
Table 3.4: Rating stability: number of increased, decreased and stable responses in
Experiment 3.
judgment patterns appear in Table 3.4
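The kind of tally reported in Table 3.4 can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the analysis script used in the experiment: ratings are assumed to be coded 1 (acceptable) and 0 (unacceptable), the function names are hypothetical, and the sample data are invented for demonstration.

```python
from collections import Counter

def classify_change(first, last):
    """Label one participant's rating change for one sentence type."""
    if last > first:
        return "Increase"
    if last < first:
        return "Decrease"
    return "Stable"

def tally_stability(ratings):
    """Count change labels over (first, last) rating pairs."""
    counts = Counter(classify_change(first, last) for first, last in ratings)
    # Report all three categories, including any with no responses.
    return {label: counts.get(label, 0) for label in ("Increase", "Decrease", "Stable")}

# Invented (first, last) ratings for five participants: 1 = acceptable, 0 = not.
example = [(0, 1), (1, 1), (0, 0), (1, 0), (0, 1)]
print(tally_stability(example))
```

Running the tally once per sentence type would yield one column of a table with the same layout as Table 3.4.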
3.8 Discussion
Experiment 3 found no significant effects for rating change in a reading-based
exposure task. This finding, on the surface, favors the account that Satiation
effects are particular to rating-based exposure and potentially not driven by
mechanisms underlying syntactic priming. On the other hand, it may be the case
that the Maze task is not an adequate exposure task to sustain syntactic priming
effects. Although the task has been shown to detect processing differences (i.e.
subject/object relative clause asymmetries; Forster et al., 2009), it is not clear
that the processing mechanisms that support syntactic priming are engaged when
comprehension of sentences is word-by-word, as this type of comprehension is not
‘natural’ in any typical sense. It is also plausible that a decision task that
incorporates non-word options further prevents typical comprehension from proceeding
naturally. Moreover, evidence for syntactic priming in comprehension appears to have
been more elusive in general than evidence from production, varying from task to
task and measure to measure, and often relying on lexical repetition to show
robust effects.
Other evidence that the Maze task is not a well-suited exposure task to detect
syntactic priming effects comes from the distribution of overall mean ratings for
Experiment 3, seen in Table 3.8. Rating means for Experiment 3 drop instead of in-
crease, highlighting a potential unexpected interaction between the reading exposure
qualitative differences between more readily interpretable satiating types and less
interpretable non-satiating types. Again here in Experiment 2 there is corroborating
evidence for a distinction between satiating types based on interpretability. How-
ever the current evidence for Satiation comes from a balanced experimental design.
If meaning is a relevant factor, it appears to be applied under repetition in general
regardless of the biased ratio between grammatical and ungrammatical test items.
3.10 Summary
The goal of this chapter was to begin to isolate the potential sources of the
Satiation effect. Two experiments addressing the potential task-based and
processing-based influences on rating change were conducted, but revealed mixed
results. On the one hand, a significant Satiation effect was found in a
balanced-design rating task. This finding suggests that task bias alone cannot
explain improved rating as a function of exposure; however, results do suggest that
bias may contribute to a generalized instability across sentence types. On the other
hand, a second experiment, aimed at assessing the extent to which rating change is
dependent on repeated rating exposure and the judgment process in general, found no
significant change in acceptability for any of the five sentence types tested.
Considering the data from overall acceptability for the experimental session, the
adequacy of the word-by-word task to serve as an exposure task for Satiation effects
is challenged. This result, in tandem with little evidence supporting robust priming
effects in sentence comprehension in general, suggests that exposure type plays a
critical role in producing detectable effects.
In sum, the evidence from this chapter argues for a corroborating source of
evidence to distinguish the relative contributions of the decision-making process
and language processing on the Satiation effect. Chapter 4 aims to fill this gap by
coordinating Satiation effects in an offline rating task with data from online reading
time measures.
surface (un)acceptability in (1) derives from an assessment of grammatical well-
formedness.
(1) a. Colorless green ideas sleep furiously.
b. * Furiously sleep ideas green colorless.
What is less clear are cases in which the grammatical status of a linguistic
structure is obscured by one or more factors other than syntactic well-formedness.
Experimental syntax proponents suggest that some of these factors can be mitigated
with controlled data collection techniques. Typical confounding factors purported
to be diminished include task and informant variability. For example, the
presentation order of sentences to be evaluated can have an influence on
acceptability ratings. Greenbaum (1976) found that the sentence in (4) was
consistently judged more grammatical when evaluated together with (2) than when
evaluated together with (3).
(2) We didn’t dare answer him back.
(3) We dared not answer him back.
(4) We didn’t dare to answer him back.
This indicates that structural comparison between test items can influence the
evaluation process. Experimental approaches attempt to reduce the effects of sen-
tence ordering by implementing counter-balancing and randomization techniques
typical in group testing practices. This approach aims to distribute potential effects
that arise due to direct comparisons across all participants.
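The logic of counter-balancing and randomization can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes a Latin-square rotation of conditions over items (one common form of counter-balancing) and a seeded shuffle standing in for a per-participant random order. The function names, item labels and condition labels are hypothetical and are not drawn from any of the studies cited.

```python
import random

def counterbalanced_lists(items, conditions):
    """Rotate conditions over items (Latin square) so that each item appears
    once per list and in every condition across the set of lists."""
    n = len(conditions)
    return [
        [(item, conditions[(i + shift) % n]) for i, item in enumerate(items)]
        for shift in range(n)
    ]

def randomized_order(trials, seed):
    """Return a shuffled copy of the trials; the seed stands in for a
    per-participant random presentation order."""
    shuffled = list(trials)
    random.Random(seed).shuffle(shuffled)
    return shuffled

items = ["item1", "item2", "item3", "item4"]
conditions = ["violation", "control"]
lists = counterbalanced_lists(items, conditions)
for participant, trial_list in enumerate(lists):
    print(randomized_order(trial_list, seed=participant))
```

Because every item is seen in every condition across lists and in a different order by each participant, any advantage that one sentence gains from its neighbors is spread over conditions rather than attached to one of them.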
Another type of variability found to affect acceptability ratings deals with the
mental state of informants. Carroll et al. (1981) found that participants that pro-
vided judgments under ‘objectively aware’ (viewing oneself as others may) conditions
tended to use meaning and communicativeness as a criterion to assess grammatical-
ity, whereas participants that were ‘subjectively aware’ (viewing oneself as an unre-
flective participant) assessed linguistic anomaly in response to their structural prop-
ity associated with language processing. However, two alternative approaches raised
in the light of investigations on Satiation effects propose that experimental method-
ologies may inadvertently introduce, instead of reduce, confounding factors and that
these factors have been misinterpreted as evidence for detecting processing effects
in judgment tasks. One potential factor, noted in the Equalization strategy hypoth-
esis, concerns the decision-making process and the influence of experimental design
in testing conditions. Sprouse (2009) notes that the majority of Satiation studies
have implemented designs in which the experimental materials were predominantly
anomalous. Based on a series of replication studies in which no Satiation effects are
found in balanced designs, he proposes that the Satiation effect reflects a task-based
strategy in which participants attempt to equalize the number of ‘yes’ and ‘no’
responses across the entire experimental session.
This hypothesis is of key importance to assessing Satiation effects for the current
investigation, but also makes a more general claim about experimental approaches
to syntax, namely that the particular configuration of an experimental design itself
can systematically alter rating responses.
A second potential confounding factor concerns the informants typically recruited
to participate in experiments. Arguments have been made that in order to avoid
experimenter bias, acceptability judgments should not be performed by linguists
themselves. As such, experimental syntax has typically been based on data from
naive participants. This of course has its advantages, beyond the mitigation of
experimenter bias influences (Hawthorne Effect), by reducing the aforementioned
individual differences inherent to human subjects. However, naive participants,
untrained in syntax by definition, appear to be susceptible to confusing structural
well-formedness with meaningfulness or typicality (Maclay and Sleator, 1960).
As noted in Chapter 2, on the surface those sentence structures that have shown
Satiation effects in the literature are more readily interpretable than non-satiating
structures. The Interpretability approach raised here suggests that, as a function of
repeated exposure, participants actively or passively adopt a meaning-based criterion
despite formal instruction to evaluate the structural properties of sentences.
(6) Adjunct Items
a. When did Fraser clean the bathroom while Judy mopped the kitchen?
b. What did Fraser clean the bathroom while Judy mopped on Saturday?
Participants read a context sentence and then pressed a button to get either the
violation or control sentence (counter-balanced across subjects). Half of the sentence
sets presented contained a comprehension probe to assure that participants were
reading for comprehension. Results showed that reading times for Whether Islands
differed from controls as a function of exposure, whereas Adjuncts showed no
change. These findings suggest that Satiation effects do have a processing
correlate.
Braze frames these results in terms of memory limitations, and underlines the
notion that differential responses to repeated exposure index processing
asymmetries, along the lines proposed by Snyder (2000). However, considering the
mixed findings from Satiation studies reviewed in this thesis, it is difficult to
determine if improved ratings reported for all satiating sentence types stem from
processing burdens. Want for and That-trace effects have no clear processing account
yet still show effects in various studies. Furthermore, if the influence of exposure
is processing-based, then the question arises: why don’t we find consistent Satiation
effects for Complex NPs and, to some extent, Subject Islands in experiments that test
them?
Although Braze’s experiment makes inroads into connecting judgment-based Satiation
to differential effects from measures more typically employed to gauge processing
effects, the study does not coordinate the same participants across rating and
reading tasks. As it stands, if Satiation effects are found in a rating task and
independently found in a reading task, it does not follow that the same factors
underlie both effects. It is potentially the case that some effects that influence
ratings in a judgment task and/or a reading task are task-specific. Coordinating the
same participants would provide an important perspective on effects in a rating task
and their subsequent effects in a reading task.
4.4 Summary and predictions
Considering the multi-faceted nature of acceptability ratings, a clear understanding
of the source of improved ratings as a function of exposure is difficult to attain from
the current methodologies (ratings as both exposure and measure). Given the claims
regarding the source of the Satiation effect, it makes intuitive sense to assess the
influence of task bias on ratings in coordination with measures more typically
applied to gauge processing effects, which can help separate the influence of bias
from that of exposure. Experiments 4 and 5 aim to fill this gap by conducting a
two-part experiment in which the same set of participants provide data for rating
and self-paced reading tasks.
If task bias is an operative characteristic of the Satiation effect, ungrammatically
biased contexts should produce more Satiation effects than grammatically biased
contexts. Evidence for a strong influence of experimental design was mitigated in
Experiment 2, but given the detected effect on stability there is reason to suspect
that bias can contribute more variability overall in an experimental session and, as
a consequence, produce more Satiation effects than non-bias and reverse-bias contexts.
If the ability of participants to interpret certain sentences drives Satiation effects,
then, all else being equal, exposure, not task balance, should produce improved rating
scores.
Experiment 4 aims at assessing the relative contributions of these processes on
rating scores but also serves as the exposure task for a subsequent self-paced read-
ing task (Experiment 5). Claims that Satiation reveals processing asymmetries that
stem from syntactic priming predict detectable reading time asymmetries for satiat-
ing versus non-satiating sentence types as a function of exposure, regardless of bias
context. The Memory Bottleneck account makes an even more specific prediction in
that only those sentence types hypothesized to be anomalous due to working mem-
ory demands should show differential processing effects (Subjacency violations).
Coordinate Constraint    Who did Kate meet new member and for dinner?
Needs Construction       What does David say Amber believes needs cut?
Infinitive Subject       Who was to tell the results at the exam cruel?
Table 4.1: Additional violation types in Experiment 4
Set A                      Set B
That-trace                 Want for
Complex NP                 Whether Island
Adjunct Island             Coordinate Constraint
Grammatical filler (x 7)   ...
Table 4.2: Grammatical bias
Each treatment list was balanced for repetition frame, which created 20 presentation
lists. Sentences within each frame were randomized automatically on presentation to
avoid potential ordering effects.
Items that appeared in previous experiments were reused here with minor
modification. The length of the sentences was modified for each sentence type in order
to ensure that the reading windows explored in the subsequent self-paced reading task
were of equal number. An effort was made to distribute the tense of the auxiliaries
used (i.e. do, did, will). New items also conformed to these modifications and to other
characteristics taken into account in previous materials creation.
Participants were asked to give acceptability ratings for the 55 interrogative
sentences. Each participant was assigned to one of the 20 presentation lists. The
experimental software DMDX (Forster and Forster, 2003) was used to present items
and record participant responses. Each sentence was displayed individually, on one
line, centered on the monitor and in Courier New font. A default timeout was set at
30 seconds in order to give the respondent ample time to respond.²
The instructions for the experiment were contained within the experimental ses-
sion. The main components of these instructions mirrored those used in the previous
² In fact, most participants never reached the timeout threshold and were otherwise
unaware that there was a time limit imposed by the software.
[Figure 4.1 panels: (a) Grammatical, titled "Overall means for the grammatical bias"; (b) Ungrammatical, titled "Overall means for the ungrammatical bias". Y-axis: Percent Acceptable (0-100). X-axis: Filler, Want For, That-trace, Adjunct Island, Whether, Complex NP, Coordinate.]
Figure 4.1: Overall acceptability for bias conditions in Experiment 4.
ous experiments.
A 2 (Bias) x 7 (Type) (x 2 (List)) ANOVA revealed main effects for Bias
[F(1, 138) = 23.6, p < 0.001] and Type [F(10, 759) = 220.6, p < 0.001] with no
interaction between them. Follow-up comparisons by Bias for each sentence type
were conducted. Results showed that That-trace [F(1, 138) = 4.0, p < 0.05], Whether
Islands [F(1, 138) = 6.9, p < 0.01] and grammatical fillers [F(1, 138) = 6.2, p < 0.05]
differed significantly between grammatically and ungrammatically biased contexts,
with both Want For [F(1, 138) = 3.8, p = 0.05] and Complex NP [F(1, 138) = 3.5,
p = 0.06] violations showing marginal effects. Deviance scores from a G² test also
found a contrast for That-trace (G² = 223.4, p < 0.05), Whether Islands (G² = 173.2,
p < 0.01) and grammatical fillers (G² = 1563.0, p < 0.05), as well as the marginal
effects found for Want For (G² = 214.9, p = 0.07) and Complex NP (G² = 114.3,
p = 0.06) sentence types. No other contrast was found by either measure.
Turning to the influence of repeated exposure, a 2 (Set) x 7 (Type) ANOVA
was conducted collapsing over bias lists and then grammatical and ungrammatical
bias conditions independently. The results for all bias contexts combined show main
[Figure 4.2 panels: (a) Grammatical bias; (b) Ungrammatical bias. Each panel is titled "Stability" and plots the Number of Stable/Variable responses (0-20) for That-trace, Complex NP, Adjunct Island, Want for, Whether Island and Coordinate, with a legend distinguishing Stable from Variable.]
Figure 4.2: Stability measures for bias conditions in Experiment 4.
4.6 Discussion
Evidence for Satiation effects was found in both grammatical and ungrammatical
bias conditions. This evidence suggests that Satiation effects are observed even
under conditions that are hypothesized to influence participants to reduce (not
increase) acceptability as a function of exposure. Notably, strong effects are
found for Whether Islands in both ungrammatical and grammatical bias conditions.
Although Want for constructions show only marginal effects in both conditions, they
are also highly acceptable overall, suggesting that Want for sentences experience a
ceiling effect in which rating gains are capped by high overall acceptability rather
than reflecting weak Satiation effects. Taken together, Satiation effects do not
appear to be directly contingent on task bias.
The evidence for an interpretability account for Satiation effects is also limited.
First, although Whether Islands, Want for and Complex NP Islands are qualitatively
interpretable, Coordinate constraint violations are not. Furthermore, it is not clear
why effects are not found for That-trace constructions, which appear to be more
meaningful than Coordinate constructions. Secondly, an interpretability approach,
as framed, predicts exposure to be the key factor in producing Satiation effects.
However, of the four sentence types that show Satiation patterns, only two (Whether
Islands and Want for violations) show them in both bias contexts, suggesting task
bias does play a role in rating improvement to some degree.
As a point of consistency, both bias conditions show the same relative ordering in
overall scores. Considering the findings in Experiments 1 and 2, in which relative
acceptability appeared to be affected by task bias, participants in Experiment 5
responded quite consistently despite the bias mismatch (Figure 4.1).
As a clear difference between bias conditions, participants tended to rate anomalous
sentences higher generally in ungrammatical contexts. Despite this overall asymmetry,
the increased rating is distributed across the entire experimental session and does
not arise exclusively in the second portion of the task (Table 4.5.2). Furthermore,
neither bias condition showed a difference in the number of sentence types that were
stable/unstable, in contrast to findings from Experiments 1 and 2. However, task
bias does appear to affect rating stability to some degree.
The most striking results pertain to the unexpected Satiation effects for Complex
NP and Coordinate Constraint forms. On the one hand, Complex NP violations
have shown effects in previous studies – however not in the current work. But un-
expectedly, effects are attested in grammatical contexts, and not in ungrammatical
contexts. This finding is odd from the standpoint of consistency between
experimental sessions. If a sentence type satiates in one condition and not in
another, chances are the effect stems from more variable participant-related factors
and not from exposure alone. Furthermore, this finding does not bode well for the
case for bias-induced satiation effects, in that Equalization predicts effects only in
ungrammatical condition sets, against current findings.
On the other hand, Satiation effects for Coordinate constraint constructions are
not predicted on any account. Coordinate forms were selected in this manipulation
to pair against another decidedly syntactically anomalous structure (Adjunct
Islands). Overall acceptability for Coordinates, and for Complex NPs for that mat-
After completing the rating portion of the experimental session, participants
were asked to read 60 interrogative sentences on a computer screen. Participants
were randomly assigned to one of the 5 presentation lists, which were displayed on
a computer monitor and coordinated by the experimental software DMDX (Forster
and Forster, 2003). The words in each sentence initially appeared with each
character replaced by an asterisk (*). Pressing the RIGHT button on the button box
revealed the sentence two words at a time. Yes/No comprehension questions also
appeared pseudorandomly, to which the participant was asked to respond by pressing
YES or NO (also corresponding to the LEFT and RIGHT buttons). Target sentences and
comprehension questions appeared individually, on one line, centered and in Courier
New font. This portion of the experimental session lasted from 10 to 15 minutes.
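The masked, two-words-at-a-time display described above can be illustrated with a brief Python sketch. It is not the DMDX script used in the experiment. It assumes a non-cumulative window (words outside the current window return to asterisks), which the description does not specify, and it assumes that punctuation is masked along with letters while spaces are preserved. The function name and the example sentence are hypothetical.

```python
def masked_frames(sentence, window=2):
    """Return the successive displays of a moving-window presentation."""
    words = sentence.split()
    masked = ["*" * len(word) for word in words]
    frames = [" ".join(masked)]  # initial display: every character masked
    for start in range(0, len(words), window):
        # Assumption: only the current window is visible; all else is masked.
        frame = masked[:start] + words[start:start + window] + masked[start + window:]
        frames.append(" ".join(frame))
    return frames

# Illustrative sentence only; not an item from the experimental materials.
for frame in masked_frames("Who did Kate meet for dinner?"):
    print(frame)
```

Each button press corresponds to advancing one frame, and the time between presses is the reading time recorded for that two-word region.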
Responses from the questionnaire completed before the rating task, the rating
task data and the reading task data were entered into a database for organization
and preparation for data analysis. Both written and electronic data were connected
by a participant code assigned at the beginning of the session and recorded on the
questionnaire form and the experimental output file as an ID.
4.7.2 Results
Response times (RTs) collected were prepared for analysis in the following manner.
First, error rates were calculated for each participant. Participants whose error rates
for comprehension questions exceeded 20% were excluded from the data analysis.⁴
Second, outliers were adjusted: all RTs more than 2 standard deviations above or
below each participant’s own mean reading time for critical regions were replaced
with the corresponding cut-off value (2 standard deviations above or below that
mean). Thus, the high/low cut-off points were dynamic and varied according to
individual reading patterns.
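The two preparation steps just described (exclusion by error rate, then adjustment of outliers to a participant-specific cut-off) can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the procedure actually run: the data layout, function names and sample values are hypothetical, and the sample standard deviation (n - 1 denominator) is assumed because the text does not say which estimator was used.

```python
from statistics import mean, stdev

def error_rate(answers):
    """Proportion of incorrect comprehension answers (True = correct)."""
    return sum(1 for correct in answers if not correct) / len(answers)

def clip_outliers(rts, n_sd=2.0):
    """Replace RTs beyond n_sd standard deviations of this participant's
    own mean with the corresponding cut-off value."""
    m, sd = mean(rts), stdev(rts)  # assumption: sample SD
    low, high = m - n_sd * sd, m + n_sd * sd
    return [min(max(rt, low), high) for rt in rts]

def preprocess(participants, max_error=0.20):
    """Drop participants whose error rate exceeds max_error, then clip the rest."""
    return {
        pid: clip_outliers(data["rts"])
        for pid, data in participants.items()
        if error_rate(data["answers"]) <= max_error
    }

# Invented data: RTs in ms for critical regions, plus comprehension accuracy.
sample = {
    "p1": {"rts": [400, 420, 410, 430, 415, 405, 425, 1500],
           "answers": [True] * 9 + [False]},      # 10% errors: retained
    "p2": {"rts": [380, 390, 400, 410],
           "answers": [True] * 7 + [False] * 3},  # 30% errors: excluded
}
cleaned = preprocess(sample)
print(cleaned)
```

Because the cut-offs are computed from each participant's own mean and standard deviation, a slow reader and a fast reader are each trimmed relative to their own baseline.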
These scores were then transformed into residual reading scores. Following stan-
⁴ As noted in section 4.7.1, this is the same error rate used to exclude participants
in Experiment 4, the exposure task for the current experiment. Rating data and RTs
were collected and analyzed for the same participants. Thus, only data from
participants who took part in both sessions at a minimum level of engagement were
used for analysis.
terms of corroborating effects between rating and reading tasks both processing
accounts fare about the same on the surface. The Memory Bottleneck account
correctly predicts effects in both rating and reading tasks for Subjacency violations
(Whether Islands and Complex NP structures) but not the effects found for Want
for and Coordinates.
A Syntactic Priming interpretation of Satiation effects can account for Whether
Islands, Complex NPs and Want for structures if we assume that these structures
are grammatically well-formed⁶ and are marked in terms of familiarity, which
influences initial acceptability ratings, and that recent exposure increases
familiarity with the structure.
An explanation for Coordinates cannot proceed in the same way. Coordinates are
robustly unacceptable, and cannot plausibly be viewed as syntactically well-formed
under any syntactic theory.
4.9.3 Residual issues
In light of the discussion of the current findings, there are two main questions that
remain: 1) What drives the apparent mismatch in satiating types between bias
conditions in rating tasks? 2) What underlies Satiation effects in rating and reading
tasks for Coordinate constructions?
Rating task
The first step in addressing these issues is to address the question of why a number
of participants find Coordinate constructions acceptable on repetition. The position
advocated here is that Coordinate constructions are interpretable. On the surface
this claim appears unfounded: Coordinates are not interpretable. There is a clear
difference in the extent to which Coordinates (9) and other satiating types, such
as Whether Islands (10), are readily interpreted as meaningful.
⁶ The assumption made here is that if Whether Islands and Complex NPs show Satiation
effects in both rating and reading tasks independent of bias condition, then one of
the two processing accounts for Satiation, both of which claim that effects are
predicated on licit syntactic structure, must be operative.
All else being equal, grammatical bias should also provide sufficient context for That-
trace constructions to be recognized as correctable, but the evidence here suggests
that, if this does occur, it does not translate into improved rating scores.
A particular characteristic of the experimental design that is different for
Coordinates and That-trace structures concerns the other sentence violation types
contained in the presentation list. Coordinates were one of three target structures in
Set B that appear to be deviant by one word. Conversely, That-trace structures are
the only structures in Set A that are one-word correctable. In essence, a
correctability strategy is ‘bootstrapped’ by contextual influence, given that a number
of other anomalous sentence types are correctable in similar ways. On this view, a
combination of task bias and the composition of the other sentence materials in the
exposure set induces pressure to find correctable alternatives of a certain type.
Reading task
If the interpretable-by-correction approach is on the right track, we have an
explanation for why Satiation is found for Coordinates only in ungrammatical bias
contexts and is not found for That-trace effects regardless of bias context. However,
an interpretability account is hypothesized to be an explicit process in which a
combination of task bias and reasoning-based strategies encourages participants to
respond more favorably to correctable sentence types over the course of the
experimental session. Consequently, no effects are predicted to carry over to a
reading task. Nevertheless, all structures proposed to satiate due to correctability
show reading time effects as well. But how does the apparent connection between
rating and reading time measures fit into the larger picture of repetition effects and
mitigated memory burdens?
A first step to address the conflict is to look at the sentence types that undergo
satiation in grammatical bias conditions. Key to the interpretable-by-correction
approach is external pressure to search out alternative structures – ungrammatical
bias provides the rationale but the grammatical bias condition does not. Whether
Islands and Complex NPs satiate in the grammatical bias condition, suggesting
that a correction bias is not the source of these findings. Viewed in this way, rating
findings highlight the influence of at least two distinct processing effects underlying
rating improvement in judgment tasks.
These robust findings suggest judgment tasks, performed experimentally, can detect
differences in sources of anomaly. However, variable findings for some sentence
constructions across judgment tasks and null results for reading-based exposure
complicate the hypothesis that rating improvement in judgment tasks can be used as a
diagnostic test to separate syntactic constraint from processing constraint. Two
potential accounts for the varied replicability found for Wh-violations in Satiation
studies suggest that increased rating scores arise as artifacts of particular features
of some rating task designs. Sprouse (2007d) proposes that Satiation effects arise as a
by-product of ungrammatically-biased experimental designs that bias participants to
accept sentences more readily in the latter half of the experiment in an attempt to
‘equalize’ the total number of ‘yes’ and ‘no’ responses across the entire session. A
test-based approach such as this one makes inroads in explaining mismatches in
findings across Satiation studies and does appear to covary with rating (in)stability
(in particular for Complex NP violations), yet as an inherently non-linguistic
strategy it predicts more variability than is found.
A novel observation documented in this thesis is that consistently satiating and
non-satiating types show asymmetries in overall interpretability. To some extent
satiating types appear to be more readily ‘correctable’ and/or ‘interpretable’ than
non-satiating types. Initially introduced as a way to restrict the predicted variability
of the Equalization strategy, the Interpretable-by-correction approach also
incorporates another source of variability associated with judgment tasks: despite
explicit instruction, participants tend to confuse ‘communicative’ with ‘grammatical’.
Satiation effects found in Experiments 4 and 5 for Coordinate Constraint structures,
robust syntactic violations, suggest that participants’ responses can be influenced
by item-based biases: strategies developed according to the particular
characteristics of the sentence types that appear in the testing session. This type of
confound may work independently of and/or in conjunction with test-based strategies
such as the Equalization strategy to produce unexpected rating changes over the
[Figure 5.1 panels: (a) Experiment 4b, Ungrammatical Bias; (b) Experiment 2, Balanced; (c) Experiment 4a, Grammatical Bias. Y-axis: Mean Acceptability (0-100). X-axis: That-trace, Adjunct Island, Complex NP.]
Figure 5.1: Relative contrasts across experimental tasks.
wrongly interpreted as syntactically relevant. Whereas typical experimental ap-
proaches to syntactic inquiry explore the overall mean differences in acceptability
between syntactic structures, Satiation studies provide a unique perspective on the
relative differences between initial and final ratings for participants. As seen here,
there is a potential that overall differences in relative acceptability are colored by par-
ticular interactions between task, material and participant-based non-grammatical
biases that may develop over the course of the experiment. Focusing solely on the
overall acceptability scores, in (5.2a) or (5.2c) in Figure 5.2 for example, misses a
potentially revealing difference between the stability of the rating scores as a func-
tion of exposure, seen in (5.2b) and (5.2d) respectively. What is more, finding a
significant difference, or not, between these two sentence types may be contingent
on task design (explored here) and not exclusively on the underlying grammatical
status of these constructions.
Findings from experiments in this dissertation point to test and item-related
biases as active extra-grammatical confounds in judgment tasks. Furthermore, this
evidence suggests that acceptability data are not reliable without a better under-
standing of extra-grammatical influences on rating scores and how these strategies
interact in the testing session. This assertion should not be viewed, in the author’s
opinion, as a point against the Experimental Syntax program. Rather, the identi-
[Figure 5.2 panels: (a) Overall scores, Grammatical Bias; (b) Change scores (Initial Rating vs. Final Rating), Grammatical Bias; (c) Overall scores, Ungrammatical Bias; (d) Change scores (Initial Rating vs. Final Rating), Ungrammatical Bias. Y-axis: Mean Acceptability (0-100). X-axis: Adjunct Island, Whether Island.]
Figure 5.2: Overall and change scores for acceptability judgments.
