Using working memory theory to investigate the construct validity of multiple-choice reading comprehension tests such as the SAT.
- PubMed: 11409100
Abstract
When taking multiple-choice tests of reading comprehension such as the Scholastic Assessment Test (SAT), test takers use a range of strategies that vary in the extent to which they emphasize reading the questions versus reading the passages. Researchers have challenged the construct validity of these tests because test takers can achieve better-than-chance performance even if they do not read the passages at all. By using an individual-differences approach that compares the relative power of working memory span to predict SAT performance for different test-taking strategies, the authors show that the SAT appears to be tapping reading comprehension processes as long as test takers engage in at least some reading of the passages themselves.
2001, Vol. 130, No. 2, 208-223. Copyright 2001 by the American Psychological Association, Inc. 0096-3445/01/$5.00 DOI: 10.1037//0096-3445.130.2.208
Meredyth Daneman and Brenda Hannon
University of Toronto
In this article, we show how working memory theory can be
used to address questions of interest to educational researchers. In
particular, we use the working memory approach as a way to
investigate the construct validity of the reading comprehension
portion of the revised Scholastic Assessment Test (SAT). There
has been a long and persistent history of attacks on the validity of
multiple-choice tests of reading comprehension such as the SAT
(see Anderson, Hiebert, Scott, & Wilkinson, 1985; Cohen, 1984;
Drum, Calfee, & Cook, 1981; Farr, Pritchard, & Smitten, 1990;
Katz, Blackburn, & Lautenschlager, 1991; Katz, Lautenschlager,
Blackburn, & Harris, 1990; Royer, 1990). One of the most serious
criticisms is that test takers do not or need not read and compre-
hend the passages on which the test questions are based. Indeed,
Katz et al. (1990) demonstrated that test takers were able to
perform better than chance on as many as 72% of the multiple-
choice items of the reading section of the SAT when they were not
given access to the passages. On the basis of findings such as this,
critics have argued that multiple-choice reading tests in general,
and the reading portion of the SAT in particular, may largely be
measuring factors unrelated to reading comprehension. This is a
serious allegation given the widespread practice of using SAT
scores in the screening and placement of college applicants in the
United States. We first briefly describe how working memory
theory has been applied to understanding educationally relevant
tasks. Then we describe how we have applied working memory
theory to investigating the construct validity of the reading portion
of the SAT.
Meredyth Daneman and Brenda Hannon, Department of Psychology,
University of Toronto, Mississauga, Ontario, Canada.
This research was supported in part by a grant from the Natural Sciences
and Engineering Research Council of Canada.
We thank Gary Buck, Anne Connell, Irene Kostin, Tom Van Essen, and
the rest of the SAT team at the Educational Testing Service for allowing us
to use their SAT materials and for supplying us with their item difficulty
norms and details about official test administration; we thank Michelle Day
for help with task development and Candice Moore and Jayme Pickett for
help with data collection.
Correspondence concerning this article should be addressed to Meredyth
Daneman, Department of Psychology, University of Toronto, Mississauga,
Ontario L5L 1C6, Canada. Electronic mail may be sent to
daneman@psych.utoronto.ca.
Working Memory as a Predictor of Complex Cognition
There is already considerable evidence that working memory
theory can be applied to understanding performance on education-
ally relevant tasks. This is not surprising given that the construct of
working memory was proposed as an alternative to short-term
memory largely because of concerns about the ecological rele-
vance of the short-term memory construct (Baddeley & Hitch,
1974; Reitman, 1970). Prototypical models of short-term memory
(see, e.g., Atkinson & Shiffrin, 1968; Posner & Rossman, 1965)
assumed that short-term memory plays a crucial role in the per-
formance of ecologically relevant cognitive tasks such as language
comprehension, mental arithmetic, and reasoning, tasks that for
their solution require that individuals temporarily store informa-
tion and then operate on it. However, as soon as efforts were made
to test this intuitively appealing notion, it became evident that the
existing models of short-term memory were inadequate. Tradi-
tional measures of short-term memory such as word span and digit
span did not predict performance on complex cognitive tasks. So
the theory of short-term memory as a passive storage buffer was
replaced by the theory of working memory as a dynamic system
with processing and storage capabilities (see, e.g., Baddeley, 1986;
Baddeley & Hitch, 1974; Just & Carpenter, 1992). Word span and
digit span, measures that tap only passive short-term storage ca-
pacity or number of "slots," were replaced by reading span (Dane-
man & Carpenter, 1980) and operation span (Turner & Engle,
1989), measures that tap the combined processing and temporary
storage capacity of working memory during the performance of a
complex cognitive task.
There is now a substantial body of evidence that measures of the
combined processing and storage capacity of working memory
have lived up to their promise of doing a better job at predicting
performance on complex cognitive tasks than did the traditional
storage measures they replaced. Measures of working memory
capacity have been shown to predict performance on cognitive
activities as diverse as reading, listening, writing, solving verbal
and spatial reasoning problems, and programming a computer (see,
e.g., Baddeley, Logie, Nimmo-Smith, & Brereton, 1985; Benton,
Kraft, Glover, & Plake, 1984; Daneman & Carpenter, 1980, 1983;
Daneman & Green, 1986; Gathercole & Baddeley, 1993; Jurden,
1995; Kyllonen & Christal, 1990; Kyllonen & Stephens, 1990;
Masson & Miller, 1983; Shah & Miyake, 1996; Shute, 1991; for
reviews, see Daneman & Merikle, 1996; Engle, 1996). These
findings suggest that working memory plays a role in the perfor-
mance of a range of educationally relevant complex cognitive
tasks and that individuals with large working memory capacities
do better on these tasks than do individuals with smaller working
memory capacities. Indeed, the working memory approach has
been deemed so successful that a measure of the combined pro-
cessing and storage capacity of working memory has been in-
cluded in the latest edition of the Wechsler Adult Intelligence
Scale (WAIS-III; Wechsler, 1997), and digit span has been de-
moted to an optional subtest (see WAIS-III Technical Manual,
1997). Of particular relevance to the present study are the findings
that working memory capacity is a good predictor of performance
on tests of reading comprehension ability and tests of verbal
reasoning ability.
Consider for the moment the finding that measures of the
combined processing and storage capacity of working memory are
good predictors of performance on tests of reading comprehension
ability (Daneman & Merikle, 1996). Daneman and Merikle con-
ducted a meta-analysis of the literature investigating the associa-
tion between working memory capacity and different kinds of
language comprehension tasks. The meta-analysis included data
from 6,179 participants in 77 independent studies. On the predictor
task side, the meta-analysis included studies that used measures of
the combined processing and storage capacity of working memory,
such as reading span and operation span, as well as studies that
used the traditional span tests that tap predominantly storage
resources, such as word span and digit span. In a typical process
plus storage measure, individuals may be required to read and
judge the truth value of sets of unrelated sentences (e.g., "Mam-
mals are vertebrates that give birth to live young," "March is the
first month in the year that has thirty-one days," "You can trace the
languages English and German back to the same roots") and then
to recall the final words of each sentence in the set (e.g., young,
days, roots; see Daneman & Carpenter, 1980). Or they may be
required to verify the stated solutions to simple arithmetic problems
(e.g., "(2 × 3) - 2 = 4 tree," "(6/3) + 2 = 8 drink,"
"(4 × 2) - 5 = 3 chain") and then to recall the stated solutions (e.g., 4, 8,
3; see Turner & Engle, 1989) or the accompanying words (e.g.,
tree, drink, chain; see Turner & Engle, 1989). In a traditional
storage span measure, individuals simply have to store and retrieve
a string of random words (e.g., cup, shoe, ball) or digits (e.g., 8, 6,
1). On the criterion task side, the meta-analysis included studies
that assessed language skill with global or standardized tests of
comprehension and vocabulary knowledge and with specific tests
of integration. The most common global or standardized tests were
the verbal component of the SAT and the Nelson-Denny Reading
Test. Specific tests of integration included tests that assessed
people’s ability to compute the referent for a pronoun, to make
inferences, to monitor and revise inconsistencies, to acquire new
word meanings from contextual cues, to abstract the main theme,
and so on (see Daneman & Merikle, 1996).
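The recall half of the sentence-based process plus storage task described above can be sketched in a few lines. This is only an illustration using the three example sentences quoted in the text: the function names `final_word` and `score_set` are hypothetical, the truth-judgment half of the task is omitted, and the simple hit count shown here is an assumption rather than the scoring rule used by Daneman and Carpenter (1980).

```python
import string


def final_word(sentence: str) -> str:
    """Return the last word of a sentence, lowercased, without punctuation."""
    words = sentence.split()
    return words[-1].strip(string.punctuation).lower()


def score_set(sentences, recalled):
    """Count how many sentence-final words appear in the participant's recall.

    Hypothetical scoring: one point per final word recalled, order ignored.
    """
    targets = [final_word(s) for s in sentences]
    recalled_set = {w.strip().lower() for w in recalled}
    hits = sum(1 for t in targets if t in recalled_set)
    return hits, len(targets)


# The three example sentences quoted in the text.
sentences = [
    "Mammals are vertebrates that give birth to live young",
    "March is the first month in the year that has thirty-one days",
    "You can trace the languages English and German back to the same roots",
]

targets = [final_word(s) for s in sentences]

# Hypothetical participant who recalls two of the three final words.
hits, total = score_set(sentences, ["young", "roots"])
print(targets, hits, total)
```

A traditional storage span trial would present only the word list and skip the sentence processing, which is the contrast the paragraph draws between the two kinds of measure.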
The results of Daneman and Merikle’s (1996) meta-analysis
showed that verbal process plus storage measures such as reading
span were the best predictors of comprehension, correlating .41
and .52 with global and specific tests of comprehension, respec-
tively.1 However, math process plus storage measures such as
operation span were also significant predictors, correlating .30 and
.48 with global and specific tests of comprehension, respectively,2
a finding that suggests that it is an individual’s efficiency at
executing a variety of symbolic processes, and not simply sentence
comprehension processes, that is related to comprehension ability.
In addition, both the verbal and the math process plus storage
measures were better predictors of comprehension than their sim-
ple word span and digit span counterparts,3 a finding that suggests
that it is the combined processing and temporary storage capacity
of working memory, and not simply the temporary storage capac-
ity, that is important for comprehension.
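One informal way to see the pattern in these correlations is to lay the reported values side by side and check whether the confidence intervals of each process plus storage measure overlap with those of its simple span counterpart. The sketch below uses only the correlations and 95% CIs reported in the text and its footnotes (reading the garbled upper bounds in Footnote 3 as .33, .46, and .35). Non-overlap of intervals is a conservative heuristic chosen here for illustration; it is not necessarily the comparison procedure used by Daneman and Merikle (1996), and the names `Estimate` and `intervals_overlap` are hypothetical.

```python
from typing import NamedTuple


class Estimate(NamedTuple):
    r: float   # reported correlation
    lo: float  # lower bound of the reported 95% CI
    hi: float  # upper bound of the reported 95% CI


# Values as reported in the text and Footnotes 1-3.
reported = {
    ("verbal process+storage", "global"): Estimate(0.41, 0.38, 0.44),
    ("verbal process+storage", "specific"): Estimate(0.52, 0.49, 0.55),
    ("math process+storage", "global"): Estimate(0.30, 0.25, 0.35),
    ("math process+storage", "specific"): Estimate(0.48, 0.43, 0.53),
    ("word span", "global"): Estimate(0.28, 0.23, 0.33),
    ("word span", "specific"): Estimate(0.40, 0.34, 0.46),
    ("digit span", "global"): Estimate(0.14, 0.10, 0.18),
    ("digit span", "specific"): Estimate(0.30, 0.25, 0.35),
}


def intervals_overlap(a: Estimate, b: Estimate) -> bool:
    """True if the two confidence intervals share at least one point."""
    return a.lo <= b.hi and b.lo <= a.hi


# Each process plus storage measure is paired with its storage-only counterpart.
pairs = [
    ("verbal process+storage", "word span"),
    ("math process+storage", "digit span"),
]

results = {}
for complex_measure, simple_measure in pairs:
    for criterion in ("global", "specific"):
        a = reported[(complex_measure, criterion)]
        b = reported[(simple_measure, criterion)]
        results[(complex_measure, criterion)] = (
            round(a.r - b.r, 2),
            intervals_overlap(a, b),
        )

for key, (difference, overlap) in results.items():
    print(key, "difference:", difference, "CIs overlap:", overlap)
```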
All in all, the correlational evidence suggests that the capacity to
simultaneously process and store symbolic4 information in work-
ing memory is an important component of success at comprehen-
sion. Moreover, this working memory capacity seems to be a
sensitive predictor of individual differences in performance on
global tests of comprehension that use a multiple-choice format
(such as the SAT) and on specific tests of comprehension that
assess comprehension by means of a variety of other non-multiple-
choice formats (such as having test takers generate answers to
specific questions or summarize the main theme). According to the
theory, working memory span is a good predictor of comprehen-
sion because individuals who have less capacity to simultaneously
process and store verbal information in working memory are at a
disadvantage when it comes to integrating successively encountered
ideas in a text because they are less able to keep relevant,
earlier-read information active in working memory.
Construct Validity and the Reading Portion of the SAT
Does the reading comprehension portion of the SAT measure
what it was designed to measure, namely, passage comprehension?
Critics such as Katz et al. (1990) have argued no because test
takers could be using a strategy that bears little if any relation to
what the test was designed to measure; this is the strategy of
selecting answers to questions without even reading (let alone
comprehending) the passages on which the questions were based.
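What "better than chance" means for a single multiple-choice item can be made concrete with an exact binomial tail probability. The sketch below is an illustration only: it assumes five answer options per item (so a guessing rate of .20), and the counts of 100 test takers and 32 correct answers are hypothetical numbers, not data from Katz et al. (1990) or from the present study. The function name `binomial_upper_tail` is likewise hypothetical.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumption: five answer options, so blind guessing succeeds with p = .20.
chance_rate = 0.20

# Hypothetical item: 32 of 100 test takers answer correctly without the passage.
p_value = binomial_upper_tail(32, 100, chance_rate)
print(f"P(at least 32 correct by guessing) = {p_value:.4f}")
```

A small tail probability indicates that the observed accuracy would be unlikely under pure guessing, which is the sense in which an item can be answered above chance without access to the passage.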
1 The 95% confidence intervals (CIs) were .38 to .44 for global tests of
comprehension and .49 to .55 for specific tests of comprehension.
2 The CIs were .25 to .35 for global tests of comprehension and .43 to .53
for specific tests of comprehension.
3 Word span correlated .28 (CI = .23-.33) and .40 (CI = .34-.46) with
the global and specific tests of comprehension, respectively, and digit span
correlated .14 (CI = .10-.18) and .30 (CI = .25-.35) with the global and
specific tests of comprehension, respectively.
4 There is some evidence to suggest that when it comes to predicting
comprehension skills, the predictive power of the process plus storage
measures of working memory is limited to measures that tap symbolic
processes (e.g., words, sentences, digits). Daneman and Merikle’s (1996)
meta-analysis investigated only the predictive power of verbal and math
process plus storage measures. A number of recent studies have shown that
spatial process plus storage measures do not predict comprehension ability
(see, e.g., Shah & Miyake, 1996).


