BEHAVIORAL AND BRAIN SCIENCES (1994) 17, 367-447
Printed in the United States of America
© 1994 Cambridge University Press 0140-525X/94 $5.00+.00

Characteristics of dissociable human learning systems

David R. Shanks
Department of Psychology, University College London, London WC1E 6BT, England
Electronic mail: david.shanks@psychol.ucl.ac.uk

Mark F. St. John
Department of Cognitive Science, University of California at San Diego, La Jolla, CA 92093
Electronic mail: mstjohn@cogsci.ucsd.edu

Abstract: A number of ways of taxonomizing human learning have been proposed. We examine the evidence for one such proposal, namely, that there exist independent explicit and implicit learning systems. This combines two further distinctions, (1) between learning that takes place with versus without concurrent awareness, and (2) between learning that involves the encoding of instances (or fragments) versus the induction of abstract rules or hypotheses. Implicit learning is assumed to involve unconscious rule learning. We examine the evidence for implicit learning derived from subliminal learning, conditioning, artificial grammar learning, instrumental learning, and reaction times in sequence learning. We conclude that unconscious learning has not been satisfactorily established in any of these areas. The assumption that learning in some of these tasks (e.g., artificial grammar learning) is predominantly based on rule abstraction is questionable. When subjects cannot report the "implicitly learned" rules that govern stimulus selection, this is often because their knowledge consists of instances or fragments of the training stimuli rather than rules. In contrast to the distinction between conscious and unconscious learning, the distinction between instance and rule learning is a sound and meaningful way of taxonomizing human learning. We discuss various computational models of these two forms of learning.

Keywords: artificial grammar; categorization; connectionism; consciousness; explicit/implicit processes; instances; learning; memory; rules

1. Introduction

A classic issue faced by researchers attempting to understand the basic laws of learning is whether there is more than one basic learning mechanism. Can all the phenomena of learning be accommodated by a unitary mechanism, or do we need to posit the existence of independent and dissociable human learning systems? In this target article we consider some of the experimental evidence - much of it very recent - that has addressed this issue.

We will consider two dimensions on which it has been suggested that functionally distinct learning systems differ. The first dimension concerns the role of awareness during learning. Many authors (e.g., Hayes & Broadbent 1988; Lewicki et al. 1987; Reber 1989a) have argued that in addition to having a learning system whose functioning is accompanied by concurrent awareness of what is being learned, humans have a quite separate system that operates independently of awareness. The second dimension, which turns out to be closely related to the first, concerns the content of learning. Distinct learning systems encode very different sorts of information: one system induces rules (e.g., Lea & Simon 1979; Nosofsky et al. 1989), whereas a second system memorizes instances (e.g., Brooks 1978; Medin & Schaffer 1978).

We believe it is important to evaluate the current evidence for and against the multiple-systems view for at least two reasons. First, each of the separate systems that has been hypothesized has tended to encourage researchers to develop a set of explanatory constructs that are unique to that system and that allow its characteristic phenomena to be explained. A drawback, however, is that experimental results are often interpreted exclusively in terms of these restricted concepts, with no consideration of whether they might also be understood (and possibly better understood) in terms of more general principles.

The second and perhaps more pressing reason for evaluating the evidence for dissociable learning systems is that there has been considerable interest, over the last few years, in whether there exist dissociable memory systems (for reviews, see Richardson-Klavehn & Bjork 1988; Schacter 1987; 1989; Squire 1992). The mounting positive evidence comes from a variety of sources. For example, amnesic patients have been shown to be dramatically impaired on certain direct tests of memory, such as free recall, but less impaired or even unimpaired on indirect tests of memory, such as motor skills (see Squire 1992). Although dissociations between performance on direct and indirect tests do not force us to conclude that there are dissociable memory systems (e.g., Jacoby & Kelley 1991; Roediger 1990), some researchers have argued at length that the experimental results, together with current understanding of brain functioning, strongly imply the existence of separable underlying systems (e.g., Schacter 1989; Squire 1992).

Few would argue that learning and memory can be studied independently. On the contrary, the possible characteristics of dissociable learning systems should be considered in research on the issue of dissociable memory systems and vice versa. Indeed, if there really are dissociable memory systems, it seems very likely that there are also dissociable learning systems that supply them with information. Yet, as several authors have noted (e.g., Berry & Dienes 1991; Reber 1989a), research on learning and on memory has tended to proceed independently. We hope to help memory researchers in their attempts to understand information storage and retrieval by examining carefully the question of whether distinct learning systems exist and by analyzing the properties of the learning mechanisms that acquire information.

1.1. Proposed distinctions between types of learning

Distinctions between different types of learning have been common in psychology for many years. One such distinction is between declarative and procedural learning, that is, between the acquisition of factual knowledge and the acquisition of skills, respectively (e.g., Cohen & Squire 1980; Morris 1984; Winograd 1975). Other distinctions include the acquisition of "habits" versus "memories" (Mishkin et al. 1984) and "taxon" versus "locale" learning (O'Keefe & Nadel 1978). Of course, if independent memory systems require independent learning mechanisms, then many more distinctions might be needed. For instance, we might require separate learning systems to feed semantic and episodic memory stores (Neely 1989; Tulving 1983; see also multiple book review: BBS 7(2) 1984).

Of these distinctions, the one between declarative and procedural learning has probably attracted the most attention, with a variety of empirical phenomena being interpreted in that framework.
For example, Cohen and Squire (1980) suggested that amnesics have normal or near-normal procedural learning but impaired declarative learning, a theoretical notion that has been widely taken up by other researchers in the amnesia field. This distinction has in recent years been largely eclipsed, however, by the alternative distinction between "explicit" and "implicit" learning. (Note that some authors have replaced the original declarative/procedural distinction with the terms "declarative" and "nondeclarative" [e.g., Shimamura & Squire 1989; Squire 1992].) The main reason for the shift in terminology and emphasis toward the terms "explicit" and "implicit" is dissatisfaction with the original terminology, the term "procedural" apparently being too narrow to encompass the relevant learning effects. For example, the learning that is preserved in amnesia is not always of a procedural nature: it includes a variety of priming effects involving, for instance, the ability to complete word stems (Graf et al. 1984) and an increase in the likelihood of judging a nonfamous name famous as a result of prior exposure (e.g., Squire & McKee 1992).

The term "implicit learning" was first coined by Reber (1967), who is responsible for much of the recent interest in the issue of distinct learning systems (see Reber 1989a for a review). Different authors have used a variety of definitions to capture the fine detail of the explicit/implicit learning distinction (see Mathews et al. 1989, for examples), but the key factor is the idea that implicit learning occurs without concurrent awareness of what is being learned and represents a separate system from the one that operates in more typical learning situations, where learning does proceed with concurrent awareness (i.e., explicitly). At the same time, it is clear that many authors have been concerned with the possibility that different learning tasks might give rise to different kinds of knowledge (e.g., Mathews et al. 1989; Reber 1989a; Vokey & Brooks 1992), one kind abstract or rule based and the other based on separate fragments or instances.

For Reber, implicit learning is not only unconscious but also involves the acquisition of abstract information. The paradigm case is language learning, where people are assumed to be able implicitly to learn abstract grammatical rules. Few nonlinguists are aware of or are able to articulate the grammatical rules supposed to underlie their linguistic performance, so it makes sense to imagine that those rules are acquired, if at all, without ever being directly represented in consciousness. The rules are abstract in the sense that they apply equally to any linguistic tokens, including novel ones, that come from the appropriate syntactic categories.

Because the aware/unaware and rules/instances dimensions are logically distinct, we believe that they must be treated independently; in this target article we accordingly review evidence for these two dimensions separately. In what follows, we reserve the term "unconscious learning" for learning without awareness, regardless of what sort of knowledge is being acquired. At the same time, we use the terms "rule learning" and "instance learning" to refer to the acquisition of abstract and fragmentary knowledge, respectively, regardless of whether such learning is conscious.

Most of the article is devoted to whether unconscious learning is indeed supported by empirical evidence. In section 2 we survey a wide range of learning paradigms, from subliminal learning phenomena to Pavlovian conditioning to artificial grammar learning and serial reaction-time tasks. The stimuli and specific processes involved in performing and learning each of these tasks differ widely and may share some basic characteristics or may exhibit some basic differences.
Across these diverse paradigms we find little actual support for unconscious rule induction (i.e., for implicit learning), or for the unconscious learning of any other type of information. However, in section 3 we do find evidence for a dissociation between a rule-induction system and an instance-memorization system; we review evidence for this dissociation obtained in explicit, or conscious, learning tasks. Within each system, the range of different processes and information is still large, but they nevertheless seem to form two distinct types: slow, effortful hypothesis testing on the one hand, and fast, efficient memorization of instances and fragments of instances on the other.

We concentrate throughout on data from normal subjects. It is clear, however, that amnesic patients have learning difficulties, and these difficulties have been widely interpreted within the explicit/implicit framework (e.g., Squire 1992). For our present purposes, the data from such subjects are tangential, because the question of awareness during learning has not been directly considered in amnesics (but see Knopman 1991). In section 4 we comment briefly on the interpretation of learning data from this population of subjects.

2. Can learning occur without awareness?

Proponents of the explicit/implicit distinction have argued that there are clear demonstrations of subjects' ability to encode new information without being aware of that information, and hence that awareness is the key dimension on which separable learning systems differ. The question of whether learning can occur without awareness goes back many decades (e.g., Adams 1957; Dulany 1961; Eriksen 1960; Krasner 1958; Thorndike & Rock 1934). In addition to the recent work of Reber, which we consider below, in the last five or six years there have been a large number of sequence learning reaction time studies that have adopted an interesting and novel technique for assessing the relationship between awareness and learning. A substantial part of our review concerns results obtained using this task. We also consider evidence from a variety of conditioning procedures. We begin with some comments on experimental methodology.

2.1. The logic of dissociations

Almost all studies of unconscious learning have adopted a very constrained version of the logic of dissociation. Separate indices of learning and awareness are used in the attempt to find circumstances in which exposure to a set of stimuli leads to detectable learning unaccompanied by any reliable degree of awareness. On the face of it, such an approach could lead to unequivocal evidence of unconscious learning, but researchers using similar logic to try to establish the existence of unconscious perception have noted several problems (e.g., Reingold & Merikle 1988). What counts as a suitable test of awareness? Can we discount the possibility that our index of awareness is contaminated by unconscious information? Can we be sure it is sufficiently sensitive to detect exhaustively all conscious information? As we shall see, these are deep problems, and researchers have adopted a variety of strategies to try to circumvent them.

Firmer evidence for unconscious learning may emerge from experiments based on alternatives to this particular dissociation paradigm. To test unconscious perception, for example, Reingold and Merikle (1988) have proposed a new and interesting procedure, whereby one looks for greater sensitivity to some variable in an indirect test in which instructions make no reference to the variable as compared with an otherwise identical direct test in which the instructions do refer to the variable. Alternatively, one could try to demonstrate the independence of two learning systems by trying to establish qualitative differences between them (e.g., Merikle & Reingold 1992), such that, for example, one system is affected in one way by a variable, the other in the opposite way.

We know of only one study that has even come close to establishing such qualitative differences; this case will accordingly be considered in some detail. Hayes and Broadbent (1988) began by postulating two independent systems: an unconscious system that would slowly accumulate information about predictive events in the environment and a conscious system that would test hypotheses. They further assumed that the conscious system would be highly dependent on a limited-capacity working memory system, and the unconscious system would be independent of it. [See also Broadbent: "The Maltese Cross" BBS 7(1) 1984.]

A rather straightforward prediction emerges from this plausible model of the cognitive system. Because the conscious learning mechanism relies on working memory, there should be situations where learning is profoundly affected by loading the working memory system with a secondary task, such as generating random numbers. At the same time, because the unconscious system does not depend on working memory, other (implicit) learning tasks should be unaffected by such a secondary task.
Indeed, Hayes and Broadbent went so far as to say that unconscious learning might be facilitated by a secondary task if it prevented the conscious system from exerting an interfering influence on the unconscious system. The importance of the Hayes and Broadbent study is that, in accordance with their model, they appeared to have found two learning tasks that differed in only a minor way, one of which was inhibited and the other facilitated by a secondary task.

In their experiments Hayes and Broadbent contrasted performance in two versions of the computer "person" task. On each trial, the subject entered an attitude (e.g., polite) into the computer, which then responded with its attitude (e.g., unfriendly). The subject's task was to try to get the computer to be friendly. If we designate the 12 possible attitudes - going from very unfriendly to loving - with the numbers 1 . . . 12, then the computer's attitude on each trial was a simple numerical function of the subject's input. In one (No-Lag) condition, the computer's attitude ($O_t$) on each trial was a function of the subject's attitude ($I_t$) on the same trial:

$$O_t = I_t - 2 + r \qquad (1)$$

where $r$ is a random number ($-1$, $0$, or $1$) and the attitudes have the 12 numerical values mentioned above. In the other (Lag) condition, $I_t$ was replaced by $I_{t-1}$, so that the computer's attitude was determined by the subject's attitude on the preceding trial:

$$O_t = I_{t-1} - 2 + r \qquad (2)$$

Performance was measured in terms of the number of trials in which the subject's input was one that could (given the random element) have produced a friendly response from the computer person. Although learning occurred in both groups, Hayes and Broadbent found that subjects could give highly accurate verbal reports about the No-Lag task, indicating that their learning had been accompanied by awareness, whereas the verbal reports of subjects in the Lag version were very poor.
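
The No-Lag and Lag rules described above can be made concrete with a small simulation. The following is an illustrative sketch only, not Hayes and Broadbent's software: the function names, the clamping of attitudes to the 1-12 scale, the treatment of the first Lag trial, and the `target` value standing in for "friendly" are all assumptions, since the text does not specify them.

```python
import random

ATTITUDE_MIN, ATTITUDE_MAX = 1, 12   # the 12 attitudes described in the text
NOISE = (-1, 0, 1)                   # possible values of the random term r


def clamp(value):
    """Keep an attitude on the 1..12 scale (assumption: not stated in the text)."""
    return max(ATTITUDE_MIN, min(ATTITUDE_MAX, value))


def computer_attitude(inputs, t, lag, offset=-2, rng=random):
    """Computer's attitude on trial t (0-based index into the subject's inputs).

    lag=0 corresponds to the No-Lag rule (same-trial input);
    lag=1 corresponds to the Lag rule (preceding-trial input).
    offset=-2 is the original rule; offset=+2 is the changed rule.
    """
    source_trial = t - lag
    if source_trial < 0:
        # Assumption: the first Lag trial has no earlier input to respond to.
        return None
    return clamp(inputs[source_trial] + offset + rng.choice(NOISE))


def could_reach(input_attitude, target, offset=-2):
    """True if, for some value of r, this input could yield the target attitude."""
    return any(clamp(input_attitude + offset + r) == target for r in NOISE)


def score(inputs, target, offset=-2):
    """Number of trials whose input could have produced the target attitude."""
    return sum(could_reach(i, target, offset) for i in inputs)
```

Under these assumptions, scoring depends only on the subject's inputs, so the same `score` function applies to both conditions; the conditions differ only in which trial's output reflects a given input, which is what makes the Lag contingency harder to notice.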
This result encourages the view that learning in the No-Lag task can be readily achieved by the explicit system, but that the Lag task requires the implicit system. Thus we might predict that a concurrent secondary task would have an effect on learning in the No-Lag condition but not in the Lag condition.

To test this, Hayes and Broadbent (1988) gave subjects a block of learning trials using either Equation 1 (No-Lag group) or Equation 2 (Lag group). After 30 trials in the No-Lag condition and 50 trials in the Lag condition, performance was approximately equated, and at this point Hayes and Broadbent changed the rules by replacing the -2 in the equations with +2. They then presented a further 30 (No-Lag group) or 50 (Lag group) relearning trials. Under single-task conditions (Experiment 1), performance in the Lag condition was affected more detrimentally than in the No-Lag condition by this rule change. In contrast, when subjects were required to perform a concurrent secondary task (generating random letters or digits; Experiments 2 and 3), a change in the rule interfered more with performance in the No-Lag than in the Lag task, exactly the opposite of the result obtained when there was no secondary task. The results conform to Hayes and Broadbent's theory - and hence to their conception of separate implicit and explicit learning systems - if we simply assume that the secondary task occupied the conscious working memory system and therefore interfered with the explicit system, whereas removal of the working memory system allowed the implicit system to operate without any interfering influence from the explicit system.

Unfortunately, Green and Shanks (1993) were unable to replicate Hayes and Broadbent's results. In the single-task groups, Green and Shanks found that the introduction of the equation change had similar effects on performance in the No-Lag and Lag groups, thus failing to replicate Hayes and Broadbent's (Experiment 1) finding that performance was more detrimentally affected in the Lag condition. Under dual-task conditions the situation was the same: performance was approximately equally affected in the two groups. There was not the slightest hint that performance in the Lag group was less affected by the equation change, and hence Hayes and Broadbent's (Experiments 2 and 3) dual-task results were likewise not replicated. Green and Shanks suggest that Hayes and Broadbent may have obtained the results they did owing to the inappropriate inclusion of subjects who had learned very little prior to the equation change.

Hayes and Broadbent's dissociation posed a genuine problem for theories of learning relying on a single learning mechanism. Because the secondary task appeared to have opposing effects on the two primary tasks, Hayes and Broadbent's data seemed to support the claim that there exist dissociable learning systems. Obviously, the fact that their results could not be replicated undermines those conclusions.

With the exception of Hayes and Broadbent's study, implicit learning experiments have universally adopted the dissociation logic of attempting to demonstrate learning in the absence of any detectable degree of awareness. As we shall see, various methodological problems with the dissociation procedure make it doubtful whether unconscious learning has yet been established. It is worth bearing in mind, however, that future experiments using alternative methods may license stronger inferences concerning the dissociability of learning systems. We now begin our discussion of the empirical evidence.

2.2. Unconscious learning with subliminal stimuli

Most studies of unconscious learning have asked whether people can learn about relationships between stimuli without being aware of those relationships, but before discussing the results of such studies we will briefly consider evidence from experiments asking a more direct question: Can people learn about stimuli when they are unaware of the existence of these stimuli, that is, when the stimuli are subliminal? A situation in which unconscious learning would, on the face of it, be fairly straightforward to establish is one in which a subject is entirely unaware that the critical stimulus in the learning phase is present at all, yet still shows evidence of learning something about that stimulus.

There have, of course, been a large number of experiments in which subjects are presented with brief or low-intensity stimuli intended to be below the threshold of awareness and in which an attempt is made to measure effects of such stimulation on subsequent behavior. We ignore much of this literature, for two reasons. First, in some cases such effects may be only tenuously related to learning. For example, many subliminal activation experiments ask whether the way a stimulus is interpreted may be biased by a supposedly subliminal stimulus presented a few hundred milliseconds previously (e.g., Marcel 1983). It is doubtful, however, that such biasing effects would occur over longer intervals: instead, they are typically interpreted as examples of some sort of short-lived facilitation. Needless to say, it is difficult to draw a sharp line between perception and learning, but if unconscious learning is to have any real significance, it must be demonstrable over reasonable intervals of time (at the very least seconds or minutes rather than milliseconds). Second, many subliminal activation experiments that do appear to show longer-lasting effects (e.g., Eich 1984) have already been the subject of extensive criticism in this journal (see Holender 1986, and accompanying commentaries). We have no wish to repeat arguments made previously except to point out that in such experiments it is extremely difficult to be confident that the stimuli are indeed below the threshold of conscious perception. We accordingly focus in this section on studies that avoid these problems.

Andrade (in press), Bornstein (1992), Ghoneim and Block (1992), Greenwald (1992), and Schacter (1987) review a number of relevant studies examining learning with subliminal stimuli. Although there have been some positive results, a corresponding number of negative findings leads us to suggest that unconscious learning with subliminal stimuli has not yet been conclusively demonstrated.

Subliminal stimuli may be presented to awake subjects as auditory messages at extremely low intensity or in some scrambled form, or as images presented for very brief durations or embedded in other figures; alternatively, they may be presented to subjects during sleep or anesthesia. There is a widespread popular belief in the ability of such subliminal messages to condition attitudes or preferences or otherwise to influence behavior. Indeed, this belief is so powerful that the families of two young men who died from self-inflicted gunshot wounds sought more than $6 million in damages from the rock group Judas Priest on the grounds that subliminal messages on one of the group's records had caused the men to commit suicide (see Loftus & Klinger 1992). Recent investigations, however, suggest that the concern is misplaced. Controlled experiments attempting to see whether subliminal messages can influence behavior or whether people can use self-help audiotapes as learning aids have yielded exclusively negative results (British Psychological Society 1992; Greenwald et al. 1991; Vokey & Read 1985). It seems unlikely that unconscious learning can occur in such situations.

Several investigations of spared cognitive functions under general anesthesia have obtained evidence of small but reliable amounts of learning, but these are matched by a comparable number of negative results (see Andrade, in press; Ghoneim & Block, 1992, for reviews). If the anesthetic has been adequately administered and renders the patient entirely unconscious, then spared learning must in turn be unconscious. A typical positive result was reported by Jelicic et al. (1992). They gave anesthetized patients repeated auditory presentations of two words (e.g., yellow, green) from a semantic category. Later, when the anesthetic had worn off, subjects were asked in a priming test to generate members of those categories. Subjects were significantly more likely to produce the preexposed words than were control subjects who had not been read the words during anesthesia. Thus some information does seem to have been encoded while the subjects were unconscious.

Another positive result was reported by Kihlstrom et al. (1990). They gave anesthetized patients lists of strongly associated cue-target word pairs, with each list being presented about 67 times during the operation. Later, when the anesthetic had worn off, subjects were given a cued recall and a recognition memory test; in a third test, they were read the cue words and had to say the first word that came to mind. Although the recall and recognition tests yielded no evidence of retention, on the generation test subjects were more likely to produce target items to preexposed cue words than to nonpreexposed cue words, whether the test was relatively soon after the exposure phase (median 87 min) or much later (median 14 days). Thus, again, some degree of unconscious registration seems to have occurred.

In contrast to this are the many negative results that have been published.
Some of these are particularly revealing because they come from experiments using procedures very similar to those of studies that have found positive results. For example, Cork et al. (1992) failed to replicate the Kihlstrom et al. (1990) results using a different anesthetic but otherwise identical procedures. Furthermore, despite the likelihood that sleep renders a person less unconscious than general anesthesia, in a well-controlled experiment Wood et al. (1992) were unable to obtain evidence of learning during sleep, again with procedures similar to those used in the Kihlstrom et al. (1990) study. Similarly, Ghoneim et al. (1992) found no evidence of Pavlovian conditioning in anesthetized patients; they used experimental procedures that did reveal conditioning in nonanesthetized subjects.

This pattern of results might simply indicate that learning under anesthesia is a genuine phenomenon, but that relatively subtle methodological factors determine whether a given study will or will not obtain evidence of it. However, Andrade (in press) discusses a large number of studies, including over 20 published reports of failures, and is unable to find any clear factors that determine whether learning will or will not occur. For example, it does not seem to be especially related to the type of stimuli used. More significantly, it remains an open possibility that many positive results have been due to inadequately administered anesthetic that left some or all of the patients at least partially conscious. It is worth noting that in the Cork et al. (1992) study three subjects were excluded from the analysis because they had explicit memory of the study items! As Cork et al. say, "the extent to which implicit expressions of memory are affected by general anesthesia remains uncertain" (p. 897).

2.2.1. Conclusions. Experiments in which subjects are presented with stimuli that they are likely to be unaware of at the time of exposure yield some evidence of unconscious learning, but this is offset by a substantial body of negative evidence. At present, it would be premature to conclude from the available studies that unconscious learning is feasible.

2.3. Criteria for establishing unconscious learning with supraliminal stimuli

In the rest of this section we focus on situations where the stimuli are above the threshold for detection and identification. In such situations, subjects may be unaware of the relationships between stimuli even though they are aware of the stimuli themselves. Learning of interstimulus relationships may therefore be unconscious.

We argue that just about all unconscious learning experiments with supraliminal stimuli can be conceptually reduced to the arrangement shown in Figure 1. The figure illustrates an associative learning episode in which subjects have the opportunity to learn that two events, A and B, stand in a predictive relationship. Event A might be a tone conditioned stimulus (CS) and event B a shock unconditioned stimulus (US); the measure of learning might be a galvanic skin response (GSR) at time t2 when the CS is presented again. Or event A might be a feature or set of features, event B might be a category, and the measure of learning might be the probability of making the category response at t2. We are interested in whether subjects can learn the predictive relationship in the absence of concurrent awareness of that relationship. We assume for the sake of simplicity that there is just one learning trial.

Learning itself presumably takes place during or after presentation of event B; we wish to ascertain the subject's state of awareness during this learning episode. Unfortunately, there are likely to be profound technical difficulties in assessing awareness of a predictive relationship at just the moment learning itself occurs.
Apart from anything else, asking subjects at time tt whether they are aware ofthe relationship between stimuli A and B is likely to direct their attention to that relationship. As an illustra- tion, in a study by Baeyens et al. (1990a) that will be discussed in more detail later, the proportion of A-B relationships which the subjects appeared to be aware of on a postconditioning recognition test increased from 18% to 77% when subjects also gave concurrent estimates of awareness during the learning stage. Clearly, the con- current index ofawareness directed subjects' attention to the relationship and affected the very entity it was de- signed to measure. Hence, we will usually have to settle for assessing awareness after the target learning trial. At this time (t2 in Fig. 1), suppose we present event A (a tone previously paired with shock) and measure the GSR as well as asking subjects whether they have any particular expectancy of event B. Ifwe obtain a GSR but no evidence ofa conscious expectancy of event B, we have obtained the crucial finding that lies at the heart ofall attempts to demonstrate BEHAVIORAL AND BRAIN SCIENCES (1994) 17:3 371
implicit learning with supraliminal stimuli. For if subjects have no expectancy of event B at t2, we have some basis for inferring that they were not aware of the A-B relationship at t1. This might seem to be a very strong inference, but we believe such inferences will have to be accepted if unconscious learning is to be established. It is unavoidably difficult to assess awareness concurrently with learning, so one is forced to rely on some later test. Of course, we also make a backward inference concerning learning itself: if performance at time t2 is no better than we would expect by chance, we often infer that learning did not occur at t1. Conversely, if performance is better at t2 than we would expect by chance, we conclude that learning did occur.

[Figure 1 diagram: a time line on which events A and B occur during learning; at test, an implicit measure is taken (e.g., GSR, classification response, etc.) and a measure of conscious knowledge reveals (i) awareness of study episode, (ii) awareness of A-B relationship, or (iii) no relevant awareness.]

Figure 1. Schematic illustration of events in experiments that investigate the role of awareness in the learning of predictive relationships. Subjects witness a predictive relationship between stimuli A and B, with learning presumed to occur during the interval marked t1. At some time later (t2) stimulus A is presented again. Performance at t2 is taken as an index of learning at t1, whereas a concurrent measure of awareness at t2 is used to infer the content of the subject's awareness at t1.

2.3.1. The relationship between unconscious learning and implicit memory. The basic design shown in Figure 1 allows us to see the intimate relationship between unconscious learning and implicit retrieval: demonstrations of unconscious learning are a proper subset of the larger set of demonstrations of implicit retrieval.
Implicit retrieval is defined as occurring when information from some prior episode can be retrieved and can hence influence current processing, but in the absence of conscious recollection of that prior episode (e.g., Schacter 1987; we use the term "implicit retrieval" rather than the more common term "implicit memory" to emphasize that we are specifically considering what happens during the retrieval process). Thus, implicit retrieval requires the absence of a conscious reexperience of the study episode. Now, lack of awareness of a contingency at t2 presumably means the absence of any consciously recallable episodic memory traces in which that contingency is embedded, and hence any piece of evidence that allows us to infer unconscious learning must also be an example of implicit retrieval: this is case (iii) shown in Figure 1. The converse does not hold, however: an example of implicit retrieval does not necessarily represent evidence of unconscious learning.

Suppose that a subject emits a GSR when presented at test with a tone stimulus. There are three possible scenarios, shown in Figure 1:

1. The subject remembers the study episode, in which case the GSR response does not count as an example of implicit retrieval according to Schacter's (1987) definition. Because remembering the episode entails remembering the content of that episode (i.e., the A-B contingency), the learning could not have been implicit either.

2. The subject does not remember the study episode, but is aware - that is, has semantic knowledge - that this tone predicts shock (cf. source amnesia). Although this qualifies as a case of implicit retrieval, we would not infer that learning itself had been unconscious, since at t2 the subject is aware that A predicts B. (Note that this ignores the possibility that subjects could have been unaware of the A-B relationship at t1, but aware of it at t2, for example, as a result of observing their own behavior.
Observation of a GSR in response to the tone might lead the subject to believe that the tone must therefore predict shock. How one might exclude this possibility is a difficult question.)

3. The subject neither remembers the study episode nor has conscious semantic knowledge of the A-B relationship. This final case again qualifies as implicit retrieval. More important, we now have evidence that is relevant to unconscious learning, as lack of awareness of the relationship at retrieval licenses the inference that learning too took place without awareness.

Thus, in order for us to infer unconscious learning from implicit retrieval, the subject must be unaware of the relevant relationship that occurred in the study episode, in addition to being unaware of the episode itself. In summary, an unconscious learning experiment just is an implicit retrieval experiment, but with the added component of meeting this further condition. For researchers in the field of implicit retrieval, all that is of interest is whether the subject is unaware of the relevant study episode, as in cases (ii) and (iii). But only case (iii) is relevant to the question of unconscious learning: the subject must also be unaware of the relationship that
occurred in that episode. It is for this reason, we argue, that much of the data obtained from amnesics is irrelevant to the question of unconscious learning (see sect. 4).

2.3.2. Dissociation of task performance and verbal reports. Within the dissociation paradigm (Reingold & Merikle 1988), many studies have shown that subjects can acquire information without being able to report it verbally at a later time. Such findings have been taken as support of the claim of unconscious learning. Suppose that subjects are presented with some information at time t1 and that a subsequent performance test indicates they have encoded this information. We argue that if the aim is to establish what the subjects' state of awareness was at t1, examining the content of their verbal reports at t2 is certainly not the only way to do this and may not be the best one.

To illustrate this, note that the condition mentioned above (that the backwards inference must be valid) can be made more specific by dividing it into two further criteria. The first concerns the match between the information responsible for performance changes and the information revealed by the test of awareness. We call this the Information Criterion. The second criterion concerns the sensitivity of the test for awareness. We call this the Sensitivity Criterion.

Information Criterion: Before concluding that subjects are unaware of the information that they have learned and that is influencing their behavior, it must be possible to establish that the information the experimenter is looking for in the awareness test is indeed the information responsible for performance changes.

This criterion is intended to exclude situations such as the following: suppose the experimenter sets up a task in which performance can be improved if the subjects learn information I. Performance does indeed improve, and subjects are apparently unaware at time t2 that they have learned I.
However, an adequate explanation of the improvement in performance is that subjects are not learning I, but I*. By the experimenter's criteria, awareness of I* would be disregarded as irrelevant, and so the experimenter would erroneously conclude that the subjects' performance was under the control of some information or knowledge of which they were unaware. The Information Criterion is closely related to the notion of "correlated hypotheses" introduced by Adams (1957) and Dulany (1961), which will be discussed in section 2.6.1.

Our second criterion is far from new (e.g., Brewer 1974; Brody 1989; Dawson & Schell 1985; Ericsson & Simon 1980; 1984; Eriksen 1960; Reingold & Merikle 1988). It is simply that tests of unconscious learning must achieve an adequate level of sensitivity:

Sensitivity Criterion: To show that two dependent variables (in this case, tests of conscious knowledge and task performance) relate to dissociable underlying systems, we must be able to show that our test of awareness is sensitive to all of the relevant conscious knowledge.

Unless this criterion is met, the fact that subjects are able to transmit more information in their task performance than in a test of awareness may simply be due to the greater sensitivity of the performance test to whatever conscious information the subject has encoded. Let us take as our null hypothesis the claim that there is a single source of conscious knowledge that can manifest itself on both the performance and the awareness test. If performance is above chance, but there is no detectable awareness, an immediate inference is that our test of awareness is simply less sensitive than the performance test to the available resource of conscious information. Or, to put it another way, there is conscious knowledge that is not being detected by the supposed test of awareness but is contributing to task performance.
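
The null hypothesis just described can be made concrete with a small sketch. It is an illustration only, not a model taken from any of the studies discussed: the store of fragments, their strengths, the two thresholds, and the function name `detected` are all hypothetical assumptions chosen to show how a single conscious store read by two tests of unequal sensitivity yields an apparent dissociation.

```python
# Hypothetical strengths of three consciously held fragments (arbitrary units).
# All values and thresholds below are illustrative assumptions, not data.
conscious_store = {"fragment_1": 0.35, "fragment_2": 0.50, "fragment_3": 0.20}


def detected(store, threshold):
    """Return, sorted by name, the items strong enough for a test to pick up."""
    return sorted(name for name, strength in store.items() if strength >= threshold)


# A cue-rich performance test is assumed to have a low retrieval threshold,
# a cue-poor awareness test a high one. Both tests read the SAME store.
performance_hits = detected(conscious_store, threshold=0.30)
awareness_hits = detected(conscious_store, threshold=0.60)

print("performance test detects:", performance_hits)
print("awareness test detects:", awareness_hits)
```

Under these assumed numbers the performance test picks up two fragments and the awareness test none, although every fragment is, by construction, consciously held. The dissociation therefore reflects nothing more than the possibility that the awareness test is the less sensitive of the two.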
To rule out this possibility, we must have either (1) some independent reason to believe that the test of awareness is sensitive to all of the potentially relevant conscious information, or (2) some reason to believe that the awareness test is at least as sensitive as the performance test in terms of its ability to detect relevant conscious information. The first of these requires demonstrating that the awareness test is exhaustive, something that Reingold and Merikle (1988) have noted is likely to be very difficult to do. In contrast, the second requirement can be met if we try to make the performance and awareness tests as similar as possible in terms of retrieval context, differing only in terms of task instructions. If the instructions in the awareness test encourage the subject to retrieve as much conscious information as possible, and if the retrieval contexts in the two tests are approximately matched, then the Sensitivity Criterion may be met, because it is unlikely that the performance test would elicit the retrieval of more conscious information than the awareness test when the latter has provided subjects with a stronger motivation to do so. If we still obtain a dissociation between performance and awareness under such circumstances, we will have good evidence of unconscious learning.¹

As an illustration of the application of these criteria, consider a widely cited implicit learning study by Lewicki et al. (1987). In the first phase, each trial consisted of the presentation of a target item in one of the four quadrants of a computer screen (which, for purposes of discussion, we can designate as A, B, C, and D); the subjects' task was simply to press a button corresponding to that quadrant as quickly as possible. The basic idea of these experiments can be simply stated: the choice of target location on each trial was nonrandom, and the question was whether the subjects would be able to detect this nonrandomness.
Subjects were presented with sequences of seven trials, with rules constructed so that the target location on the seventh trial could be predicted from its locations on trials 1, 3, 4, and 6. On each of the first six trials, the digit 6 appeared on its own in one of the quadrants of the screen, but on trial 7 (the "complex" trial), it was embedded in a display containing 36 digits. Reaction time on the seventh trial was the measure of interest. Again, the rules specifying target location were deterministic: thus, if the target appeared in locations C, A, D, and B on trials 1, 3, 4, and 6 respectively, then on trial 7 the target would be in location A.

In common with many other such results (which will be reviewed in sect. 2.7 below), Lewicki et al. (1987, Experiment 1) found that reaction times (RTs) on the target trials decreased significantly across 4,608 complex trials. In addition, RTs increased significantly when, toward the end of the experiment, the rules were changed so that on
the complex trials the target now appeared in the quadrant diagonally opposite where it had appeared previously. This latter finding rules out nonspecific factors as the locus of the speedup effect. In a second experiment, Lewicki et al. applied deterministic rules only on two out of three sets of seven trials: on the remaining sets, target location on trial 7 was random. Here, a change in the rules only affected RTs in the sets that were rule determined and not in those that were not.

Lewicki et al. found that none of their subjects came even close to being able to report any of the rules. In fact "none of the subjects were even able to correctly specify which four out of six simple trials were the crucial ones" (1987, p. 529). Thus we appear to have good evidence of a dissociation between performance and reports.

It is highly doubtful, however, whether these results meet either the Information or the Sensitivity Criterion. With regard to the former, Lewicki et al. required subjects to try to report "at least one pair of co-occurring elements (i.e., a sequence of four target locations in simple trials and the corresponding location of the target in the subsequent matrix-scanning [complex] trial)" (p. 528). Thus subjects were classified as able to report something about the sequence, and hence as aware, only if they were able to specify a complete sequence of four simple trials and one complex trial. The problem with this classification, however, is that to show a speedup in RT, complete knowledge of the sequences was not necessary. Analysis of the sequences, for example, shows that even the last simple trial on its own was informative about target location on the seventh trial: if the target was in quadrant A on trial 6, it was twice as likely to be in quadrants A and D on trial 7 as in quadrants B or C. Trial 6 provided a great deal of information on its own about target location on trial 7.
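
How a fragment of this kind could shorten search on the complex trial can be sketched with a short calculation. The sketch rests on two assumptions that are not taken from Lewicki et al.'s design: that the 2:1 ratio just mentioned corresponds to probabilities of 1/3 each for quadrants A and D and 1/6 each for B and C, and that a subject inspects the quadrants one at a time until the target is found. The name `expected_search_steps` is likewise hypothetical.

```python
from fractions import Fraction

# Assumed trial-7 probabilities after a trial-6 target in quadrant A
# (our reading of "twice as likely"; not figures reported in the study).
p_trial7 = {
    "A": Fraction(1, 3),
    "D": Fraction(1, 3),
    "B": Fraction(1, 6),
    "C": Fraction(1, 6),
}


def expected_search_steps(order, probs):
    """Mean number of quadrants inspected when they are checked in `order`."""
    return sum(Fraction(rank) * probs[q] for rank, q in enumerate(order, start=1))


# A subject who knows only the trial-6 fragment checks the likelier quadrants first.
informed = expected_search_steps(["A", "D", "B", "C"], p_trial7)

# A subject with no knowledge checks in a random order, so the target is
# equally likely to be found at any of the four positions.
uninformed = Fraction(1 + 2 + 3 + 4, 4)

print("informed:", informed, "uninformed:", uninformed)
```

Under these assumptions the informed subject inspects 13/6 quadrants on average against 5/2 for the uninformed subject, so knowledge of trial 6 alone would be enough to produce some speedup without any knowledge of the full four-trial rule.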
Knowledge about trials 4 and 6 provided still more information about target location on trial 7, but if the subjects could report this sort of regularity, it would still not have counted as correct according to Lewicki et al.'s criterion. It is true that knowledge of the sequence across the four relevant simple trials provided absolute certainty about the seventh trial, but our point is that considerable amounts of speedup in RT could be attributable to fragmentary knowledge of "microrules" that Lewicki et al. would not have counted as evidence of awareness, even if the subjects could articulate them.

Turning to the Sensitivity Criterion, we may ask whether the verbal report test is an adequate measure of the subject's awareness in this procedure. We suggest that it is not. First, we cannot be sure that the performance and awareness tests are matched in terms of the conscious information they pick up, because quite different retrieval contexts are provided for the two tests. In the case of RTs, performance is elicited in a context where (1) stimuli are presented on the computer screen, (2) responses are made on the keyboard, (3) a horizontal and a vertical line appear on the screen dividing it into quadrants, (4) a response is made very soon after the preceding response, and so on. All these cues are pertinent, in that they were present during the learning phase (which is just the RT task). In the case of verbal report, none of these cues is present. Instead, the subject is required to retrieve the sequence rules from memory, without the aid of any of the aforementioned cues.

Second, we have little reason to believe that the verbal report test provides an exhaustive index of conscious information, since there are other tests such as recognition that manifestly detect information left undetected by verbal report tests. For example, Nelson (1978) compared the sensitivity of recognition and verbal recall in the following way.
Suppose we have two memory tests, A and B. Subjects learn a list of items and are then given test A. Then, test B is applied only to those study items that test A failed to detect. If test B detects any of these items, it is said to be more sensitive than test A. It is important also to apply the tests in the reverse order - test B, then test A - and to fail to observe an increase in sensitivity. Using such a procedure, Nelson showed that recognition tests can detect items not detected by free recall tests, but the converse was not true. Hence, recognition is a more sensitive test than free recall, and the latter is therefore not exhaustive.

Moreover, note that it is possible that subjects misinterpret free report questions to mean they should only report rules. They might believe that fragmentary information is not supposed to be reported. Many researchers have attempted to avoid this problem by asking more and more specific questions about what stimuli may begin or end a sequence, and so on. Such questions are somewhat better from a sensitivity standpoint because they are more specific (and provide more cues), and may be better from an informational standpoint if they ask about the information that subjects actually learn.

In sum, we suggest that the Information and Sensitivity Criteria are not met in Lewicki et al.'s (1987) experiment. The default hypothesis - that there is only a single resource of conscious information - may be correct, with less of that knowledge being detected by the verbal report test than by the RT task. There is no evidence that the knowledge used to perform the RT task is any different or is in any way acquired independently of the knowledge that the subject's reports are based on. Verbal reports are impoverished compared to task performance simply because less of the available information is retrieved in the test of reportable knowledge.
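
The logic of Nelson's (1978) two-step comparison, described above, can be written out as a minimal sketch. The function name `more_sensitive`, the toy item list, and the two stand-in tests are hypothetical and serve only to show the order-reversal check; they are not Nelson's materials or data.

```python
def more_sensitive(first_test, second_test, items):
    """True if second_test detects any studied item that first_test missed."""
    missed = [item for item in items if not first_test(item)]
    return any(second_test(item) for item in missed)


# Toy stand-ins: each test "detects" a fixed, assumed set of studied items.
studied = ["cat", "dog", "elm"]
recall = lambda item: item in {"cat"}
recognition = lambda item: item in {"cat", "dog"}

# Apply the tests in both orders, as the procedure requires.
recognition_adds_to_recall = more_sensitive(recall, recognition, studied)
recall_adds_to_recognition = more_sensitive(recognition, recall, studied)

print(recognition_adds_to_recall, recall_adds_to_recognition)
```

With these assumed detection sets, recognition recovers an item that recall missed, whereas recall recovers nothing that recognition missed. Applied to the present case, the same check would ask whether a cue-rich test detects knowledge that free verbal report failed to elicit.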
If the subject were given enough retrieval cues, there is every reason to believe that this knowledge could be brought to consciousness and reported; it is simply that a normal test of verbal report does not do this. Last, if sufficient cues could make the information conscious, there is every reason to believe that it was conscious at the time of encoding.

It is important to note that we are not denying the empirical fact that performance and verbal reports can be dissociated. On the contrary, we acknowledge that there have been numerous satisfactory demonstrations of this (for example, in Lewicki et al.'s [1987] experiment), and that this has interesting implications for applied psychology. Subjects' performance indicates that they have learned something, yet they are poor at articulating verbally what they have learned. Instead, we are suggesting that this dissociation is only very weak evidence for the claim that the original learning was unconscious, and that it provides no evidence at all for the functional dissociation of conscious and unconscious learning. Its status is exactly the same as the difference that commonly emerges between tests of recall and recognition. For the same reason, amnesic patients' inability to recall information that an earlier test shows they had learned (e.g., Nissen & Bullemer 1987) is not in its own right evidence
of unconscious learning. Since we are claiming that a dissociation between performance and verbal report is not compelling evidence for unconscious learning, we place special weight (below) on studies that have tried to use more sensitive tests of awareness.

It is also important to recognize that our criteria do not make unconscious learning undemonstrable. As Bowers (1984) has noted, it is pointless to argue about a possible unconscious process if one's criteria for its existence make it a logical impossibility. But the Information Criterion can readily be met in any study that establishes unequivocally what it is that the subject is learning, and the Sensitivity Criterion can be met by tests that adequately reinstate the learning context or that attempt to be exhaustive with respect to conscious information. Indeed, we will see in section 2.7 below that a replication of Lewicki et al.'s experiment by Stadler (1989) met both of these criteria by using an alternative test of awareness. Furthermore, successful demonstrations of unconscious perception have been possible in experiments that use tests of awareness that meet these criteria (e.g., Merikle & Reingold 1990). In sum, Lewicki et al.'s (1987) experiments demonstrate the dangers of asking the wrong questions and of ignoring substantial differences between different types of test.

With these considerations in mind, we now turn to other evidence for learning without awareness. In the following sections, we focus on four areas of experimental evidence: conditioning, artificial grammar learning, instrumental learning, and sequential pattern acquisition.

2.4. Awareness and conditioning

2.4.1. Pavlovian conditioning. We begin with a consideration of whether classical or Pavlovian conditioned responses can be acquired in the absence of awareness of the scheduled contingency of reinforcement.
Since many researchers regard conditioning as representing a relatively primitive learning system (see Boakes 1989), it is plausible to imagine that learning without awareness can occur in this context. The conclusion from a huge number of studies, however, is quite the opposite: there is no compelling evidence for conditioning in human subjects without awareness of the reinforcement contingency. This conclusion was first reached in a classic review by Brewer (1974), and more recent studies have not changed the situation (see Boakes 1989; Dawson & Schell 1985, for reviews). Such conclusions have not always been heeded, however, because there are still claims in the literature to the effect that conditioning can occur without awareness (e.g., Musen et al. 1990, p. 1074) and is hence an instance of implicit, unconscious learning.

There have been two general approaches to examining the relationship between conditioning and awareness. First, some studies have sought to ascertain whether instructions to the subject concerning the nature of the relationship between a cue and a reinforcer affect conditioning as measured, for instance, by GSRs. The rationale is that if conditioning is a relatively automatic form of learning that can proceed independently of awareness, then changes in the subjects' conscious beliefs ought to have little effect on their behavior. Using this logic, Grings et al. (1973), for example, presented subjects with two conditioned stimuli (CSs), one of which (CS+) was followed by a shock unconditioned stimulus (US), and one of which (CS-) was not. At the end of the training stage, CS+ elicited a larger conditioned GSR than did CS-. Prior to the second stage, subjects were correctly told that the relationship between stimuli and shocks would now be reversed, with shocks following CS- but not CS+. As has been observed in many other studies, these instructions had a powerful effect on conditioned responding. Grings et al.
found that their subjects responded on the first trial of the second stage to CS- but not to CS+, indicating that their knowledge at least partially controlled their responding. Significantly, the response to CS+, a stimulus that had been paired several times with shock, was no greater than the response to a control stimulus that in the first stage had been presented with uncorrelated USs. Similar results of verbal instruction have been obtained in experiments using phobic stimuli such as pictures of snakes (Davey 1992), where it was once thought that conditioned responding could proceed independently of instructions (e.g., Hugdahl & Ohman 1977).

Although such results are unsupportive of the notion that conditioning can proceed without awareness, they do not address the issue directly because awareness itself is not examined. A recent experiment by Lovibond (1992) exemplifies the approach of eliciting measures of awareness concurrently with conditioned responses. Lovibond presented subjects with two stimuli (slides depicting flowers or mushrooms), one of which (the CS+) was paired with shock while the other (CS-) was nonreinforced. Awareness of the relationship between the stimuli and shock was measured in two ways. First, during the learning phase subjects continually adjusted a pointer to indicate their moment-by-moment expectation of shock (note that asking for a rating of shock expectancy does not specifically direct attention to the A-B relationship); second, at the end of the experiment they were given a structured interview designed to assess their awareness.

It should be apparent how the design conforms to the basic procedure depicted in Figure 1, except that there are four learning trials. In Lovibond's experiments, each of trials 2-4 in fact represents a new learning trial, an assessment of whether learning occurred on the preceding trial(s), and an assessment of the subject's awareness on the preceding trial(s).
The Information Criterion should not raise particular problems here, because there is little doubt that the information the subjects learn (the contingency between the CS and US) corresponds with what the awareness test asks them to report.

In each of the experiments, some subjects gave no indication, on either of the tests of awareness, that they associated A with shock to a greater extent than B. Critically, these subjects also gave no hint of stronger conditioned responding to A than to B. For subjects who were aware of the conditioning contingencies, GSRs were stronger to A than to B. Thus, on the basis of these results we would have to conclude that learning about a CS-shock relationship does not occur in the absence of awareness of that relationship. It is also worth noting that Lovibond's experimental design is well suited to demonstrating that our criteria for implicit learning do not make it a logical impossibility. If his results had been different - something which is simply an empirical matter - the criteria
would have been met and implicit learning could have been firmly established.

Other studies have tried to mask the CS-US relationship and again compare awareness and conditioning. The results have been clear: so long as awareness is measured by an immediate test, usually a recognition test, significant conditioning only occurs in situations where the subject is aware of the contingency (see Boakes 1989; Dawson & Schell 1985). One recent experiment serves to illustrate the typical result. Marinkovic et al. (1989) presented their subjects with a recognition memory task for odors. On each trial, one odor was presented for 8 sec as a "target," followed in succession by three further odors. Subjects' primary task was to say which of the three was the same as the target. One of the three recognition odors was in fact either the CS+ or the CS-. If it was CS+, a shock was presented at its offset; skin conductance was measured as the conditioned response. The question of interest was whether acquisition of GSRs could occur without concurrent awareness of the contingency between the CS+ and the shock. Marinkovic et al. measured awareness with a test in which subjects were required to indicate their expectancy of the shock during each odor on a 7-point scale. Because awareness was measured during the CSs, this again represents a concurrent assessment of awareness, rather than a post hoc one. The outcome was that differential conditioning to CS+ was only observed in subjects classified as aware, indicating that awareness is necessary for conditioning. In addition, Marinkovic et al. obtained some evidence that when conditioned responding did occur, it only started after the onset of awareness.

In sum, results from conditioning experiments appear to contradict the notion that this type of learning can proceed without concurrent awareness.
For a variety of reasons, some researchers have questioned whether GSRs condition in the same way other responses, such as the eyeblink or salivary reflexes, do. Thus it is worth noting that correspondences between awareness and conditioning seem to occur with other response systems as well (e.g., for eyelid conditioning, Baer & Fuhrer 1982).

The conclusion from these studies is clear, and confirms Brewer's (1974) earlier analysis: Pavlovian conditioning, which is often cited as a fundamental form of learning, does not seem to occur in the absence of awareness of the reinforcement contingency.

2.4.2. Evaluative conditioning. Evaluative conditioning refers to a form of learning that manifests itself in changes in affective response to a stimulus (Martin & Levey 1978). Specifically, it refers to the transfer of affect from a US to a CS. Some authors (e.g., Baeyens et al. 1990a; Martin & Levey 1987) have suggested that - unlike standard Pavlovian conditioning - this form of learning can proceed in the absence of awareness of the CS-US relationship. We briefly review some of the relevant evidence.

Baeyens et al. (1990a) presented subjects with 10 repetitions of a CS-US pair of slides, in which the CS slide had been previously evaluated by the subject as affectively neutral and the US slide as either liked, neutral, or disliked. Evaluative conditioning was observed in that on a postconditioning test of affect, the CS slides became affectively positive (liked) if they had been paired with a liked US, negative (disliked) if they had been paired with a disliked US, and they remained neutral if they had been paired with another neutral stimulus.

As a test of awareness, at the end of the learning phase Baeyens et al. showed the subjects each of the CS pictures and asked them to identify which had been the relevant US. If subjects failed to respond correctly they were then asked whether the US had been liked, neutral, or disliked.
They were classified as "unaware" of the CS-US relationship if they failed on both of these questions. Evidence that evaluative conditioning occurred without awareness emerged in the observation that conditioning was the same for CS-US pairs, regardless of whether or not the subject was aware of the relationship.

Of course, the test of awareness may have been an insensitive one. Baeyens et al. accordingly tried to use a more sensitive concurrent measure of awareness. One group of subjects was required to indicate during the 4-sec interval between the onset of the CS and US slides whether they expected a liked, neutral, or disliked US stimulus on that trial. Subjects were classified as "unaware" if they failed to respond correctly on the final three pairings of each stimulus combination. Unfortunately, results from this group undermine the notion of unconscious learning. As discussed in section 2.3, subjects could accurately report most of the pairings, and for those few they could not report, there was no significant evaluative conditioning. Further, in another study, Baeyens et al. (1992) found that groups of subjects given increasing numbers of CS-US pairings showed an increase in both the magnitude of evaluative conditioning and the level of awareness as measured by a postconditioning test. In sum, these studies of evaluative conditioning have failed to show that it can occur unconsciously. (See Shanks & Dickinson 1990, for further criticisms of this research.)

Although they are not usually classified as studies of evaluative conditioning, Lewicki's (1986; Lewicki et al. 1989) experiments on the learning of nonsalient contingencies can be readily conceived as such. Lewicki presented subjects with photographs of people accompanied by personality descriptions such as "kind" or "capable." For some subjects all "kind" people had long hair and all "capable" people had short hair, while for other subjects the opposite was the case.
Lewicki reported that on test trials in which subjects had to affirm or disconfirm statements classifying new people as either "kind" or "capable," they responded "yes" more often when the description preserved the study-phase correlation than when it broke the correlation. (They also consistently took longer to answer "yes" when the correlation was preserved.)

Lewicki's (1986) subjects were apparently unaware of the relationship between hair length and personality description, because "not one subject mentioned haircut or anything connected with hair" (p. 138) in a test of verbally reportable knowledge. If we take the personality description as being an evaluative response conditioned to the cue of hair length, the results would again appear to suggest unconscious evaluative learning. However, that conclusion requires us to assume, without any supportive evidence, that the Sensitivity Criterion has been met in these studies. In addition, some of Lewicki's results have proven hard to replicate (see de Houwer et al., in press; Dulany & Poldrack 1991), so we must at this stage reserve judgment on whether this form of learning indeed can occur unconsciously.
2.4.3. Conclusions. In experiments examining the relationship between learning and awareness in Pavlovian conditioning, researchers have striven to meet the Sensitivity Criterion by using multiple tests of awareness. The Information Criterion does not raise particular problems, because there is little doubt that the information the subjects learn (the contingency between CS and US) corresponds to what the awareness test asks them to report. Thus these studies provide a reasonably good test of the role of awareness in learning. The results we have surveyed give little reason to believe that unconscious learning can occur in these situations. For evaluative conditioning the evidence is less clear-cut, but we have few reservations in suggesting that unconscious evaluative learning has not yet been adequately established.

2.5. Awareness in artificial grammar learning tasks

Studies of subjects learning artificial grammars present the classic pattern of unconscious learning: subjects clearly learn something about the input domain, but they appear unable either to report the rules of the grammar or to explain their performance. Such studies provide evidence of unconscious learning if learning involves rule induction. In this section we examine the evidence for unconscious learning of artificial grammars and conclude that memorization rather than rule induction is the principal process involved; we conclude that evidence for unconscious learning is weak. Later, in section 3.5, we review several further studies that have examined conscious hypothesis testing in artificial grammar tasks.

In a prototypical experiment, Reber (1967) required subjects to memorize either a series of letter strings generated from a small finite-state grammar or a series of strings generated at random (see Fig. 2).
Subjects who learned the rule-governed strings then performed a grammaticality test in which they were asked to accept novel strings that fit the rules and reject novel strings that did not. They categorized 79% of the 44 test strings correctly, which is significantly above chance. Yet these subjects were unable to report the rules they had apparently learned and then used in the grammaticality task.

Reber's (e.g., 1967; 1989a) account of such grammar-learning results, endorsed by many other investigators since then, proposed that subjects use an unconscious, or implicit, rule-induction mechanism. This mechanism creates a knowledge-base of rules that may be used in a grammaticality task but that is inaccessible to conscious report. As with the other unconscious learning paradigms, we believe that there is another way to interpret the data.

We can raise two questions. The first (the Sensitivity Criterion) is whether retrospective verbal report is sufficiently sensitive to test for conscious knowledge of the rules. More sensitive measures of subjects' knowledge, such as concurrent thinking-aloud protocols and recognition tests, might reveal marginal or uncertain knowledge. The second question (the Information Criterion) concerns what the subjects are learning from the training strings. If subjects have learned something other than rules, then asking them about rules may lead to erroneous conclusions. On the other hand, if we ask the subjects questions about what they did in fact learn, we may get reasonable answers. It may be that usable knowledge is always both consciously learned and consciously applied.

[Figure 2. String generator and example strings. (a) Diagram of a finite-state grammar. Strings are generated by selecting one of the possible routes through the network, commencing at "start" and continuing until one of the "end" symbols is reached. (b) Several example strings generated by the grammar: MVT, VXM, MTTV, MTTTVT, VXVRXR.]
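The generation procedure described in the caption of Figure 2 can be sketched in a few lines of Python. The transition table below is hypothetical: it was constructed only so that it produces the five example strings listed in Figure 2(b), and it should not be read as the grammar actually used in the studies discussed here. The state numbers, the function names, and the length cap are likewise assumptions made for illustration.

```python
import random

# Hypothetical transition table: state -> list of (letter, next_state).
# Built only to be consistent with the example strings in Figure 2(b);
# it is NOT the grammar from the original experiments.
TRANSITIONS = {
    0: [("M", 1), ("V", 2)],            # state 0 is "start"
    1: [("T", 1), ("V", 3)],
    2: [("X", 4)],
    3: [("T", 5)],
    4: [("M", 5), ("V", 6), ("R", 5)],
    5: [],
    6: [("R", 2)],
}
ACCEPTING = {3, 5}                      # states with an "end" exit


def is_grammatical(string):
    """Return True if the string traces a route from start to an end exit."""
    state = 0
    for letter in string:
        next_state = dict(TRANSITIONS[state]).get(letter)
        if next_state is None:          # no such arc from this state
            return False
        state = next_state
    return state in ACCEPTING


def generate(rng, max_len=8):
    """Random walk through the network; restart if the walk gets too long."""
    while True:
        state, letters = 0, []
        while len(letters) <= max_len:
            options = list(TRANSITIONS[state])
            if state in ACCEPTING:
                options.append(None)    # None stands for taking an "end" exit
            choice = rng.choice(options)
            if choice is None:
                return "".join(letters)
            letter, state = choice
            letters.append(letter)


if __name__ == "__main__":
    rng = random.Random(0)
    print([generate(rng) for _ in range(5)])
```

A grammaticality test in this setting amounts to calling `is_grammatical` on novel strings; the question at issue in the text is what knowledge subjects use in place of such a table.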
The experimenter's job is to discern what information subjects are aware of during training and whether that information is used to perform the grammaticality task.

2.5.1. Types of knowledge. The literature has identified three types of knowledge that might be acquired by subjects: rules, memory for whole strings, and knowledge of the frequency and position of substrings, such as pairs of letters. There are several problems with rules. First, it is not really clear what a "rule" would be like: Is it a rewrite rule or a transition graph? How complex can it be, and how many are there? Second, such rules would be very difficult for any but very sophisticated subjects to articulate even if they did explicitly acquire them. Third, it is not clear what sort of mechanism is capable of acquiring such rules, particularly since it must ex hypothesi operate outside consciousness. In the face of these questions, it seems sensible to consider other types of knowledge first, and to determine the extent to which they can account for subjects' performance. We return to the evidence for knowledge of rules in artificial grammar learning tasks in section 2.5.3.

The picture with regard to memory for whole strings and knowledge of substrings seems reasonably clear. Such knowledge is easy to articulate and there is ample evidence that subjects do acquire this information, because they do articulate it. These types of knowledge are also consistent with a variety of contemporary memory models, such as chunking (Servan-Schreiber & Anderson 1990), distributed memory (Cleeremans & McClelland 1991), and memory-array models (e.g., Estes 1986; Hintzman 1986; Nosofsky 1986). In addition, these models have been shown to approximate subjects' grammaticality test performance. For example, Dienes (1992) compared a number of these memory models on a set of grammaticality judgment data and was able to achieve good
fits, particularly with distributed memory models. We return to this topic in section 3.3.

With these different knowledge types in mind, we can now ask what sort of information subjects in artificial grammar learning tasks actually acquire, and whether they are conscious of it. A number of studies have asked these questions using several methods and have asked them at various points during training and testing.

Mathews et al. (1989) interrupted subjects periodically during training and asked them to instruct an imaginary confederate how to distinguish the grammatical strings. The trained subjects performed better on the grammaticality test than did the yoked subjects who were given these instructions, suggesting that not all of the trained subjects' knowledge was explicit and reportable. This verbal report procedure, however, is essentially uncued recall, and so is unlikely to evoke all of the subjects' knowledge of the grammar. More interesting, though, is that the verbal instructions that subjects did report consisted mainly of legal bigrams and other short sequences, sometimes coded by their positions in legal strings.

In a study by Servan-Schreiber and Anderson (1990), subjects were trained on grammatical strings using a recall task. For training, the strings were divided into substrings using gaps (T PPP TX VS). Servan-Schreiber and Anderson hypothesized that subjects in all grammar-learning tasks encoded the strings into substring chunks, and the gaps were used to ensure consistent chunkings across subjects. Subjects' written recall preserved these gaps. Servan-Schreiber and Anderson suggested that this phenomenon demonstrates that subjects were in fact encoding the strings as sequences of short strings in accord with the gaps. The subjects' later grammaticality judgments supported this contention as follows.
Servan-Schreiber and Anderson constructed ungrammatical strings that consisted of illegal sequences of legal substrings (e.g., PPPTXTVS). If subjects were learning just the substrings then these strings would be falsely accepted as legal strings. Indeed, 50% of these strings were mistakenly accepted. On the other hand, test strings that violated specific substrings were correctly rejected: only 26% of these strings were mistakenly accepted. Both subjects' written protocols during training and their test performance, then, support the hypothesis that subjects learn simple substring information in grammar-learning tasks.

That only 50% rather than 100% of the strings containing illegal sequences of legal substrings were accepted does not imply that knowledge of substrings is insufficient to account for performance completely. Compared to grammatical strings, these nongrammatical strings (by definition) still contain illegal bigrams (e.g., XT in the example above). In addition, subjects' knowledge at test time is clearly incomplete: previously seen grammatical strings were only accepted 70% of the time.

Moreover, Servan-Schreiber and Anderson (1990) went on to build a model that acquired chunks and then used them to evaluate the grammaticality of test strings. The model performed at the level of trained subjects (r = 0.935). This result supports their claim that subjects are learning and using chunks by demonstrating that chunks are learnable and sufficient to account for the level of performance of subjects on the grammaticality task.

It is possible that Servan-Schreiber and Anderson's presentation technique, placing gaps in the training strings, biased subjects' learning procedure. A similar experiment by Perruchet and Pacteau (1990), however, used the standard (no gap) format during training and found similar results. Subjects were trained on strings generated from the same grammar that Reber and Allen (1978) used.
To test for awareness of simple substrings, trained subjects performed a recognition test on letter pairs present in the training strings. Subjects performed quite well: only 3 out of 25 old pairs were judged less familiar than any new pair. The correlation between recognition scores and the frequency of occurrence of pairs in the training strings was 0.61. According to the results of the recognition test, then, subjects were aware of the relative frequencies of letter pairs. Similarly, Dulany et al. (1984) concluded that a recognition test of awareness could elicit as much knowledge as was projected in the grammaticality test.

Perruchet and Pacteau (1990) also constructed test strings that contained either (1) illegal orders of legal pairs, or (2) illegal pairs. If subjects only had information about legal pairs on which to judge the grammaticality of test strings, then the illegal pairs should have been rejected, but the illegal orders of legal pairs should have been mistakenly accepted as grammatical. This is the pattern of results Perruchet and Pacteau obtained. Discriminability, measured in D scores (zero indicates random responding), was 22 for illegal pairs but only 7 for illegal orders. These results therefore further support the hypothesis that subjects are aware of and make use of only simple substring information.

Perruchet and Pacteau then considered a model that used pair frequency information to make grammaticality judgments. The model produced the same level of performance as subjects, except in one instance. Subjects were sensitive to the beginnings and endings of strings, but the model was not. Perruchet and Pacteau concluded that subjects primarily knew letter pairs, but also which pairs could legally start and end strings.
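The logic of a pair-based judgment can be made concrete with a small sketch. This is not Perruchet and Pacteau's model, which used pair frequency information; the version below is a deliberately simplified, all-or-none stand-in in which a string is accepted only if every letter pair in it was seen during training. The training set is simply the five example strings from Figure 2(b), and the function names and the optional start/end check are assumptions introduced for illustration.

```python
def bigrams(string):
    """Return the adjacent letter pairs of a string, in order."""
    return [string[i:i + 2] for i in range(len(string) - 1)]


def learn(training_strings):
    """Collect seen pairs, plus the pairs that began and ended strings."""
    known, starts, ends = set(), set(), set()
    for s in training_strings:
        pairs = bigrams(s)
        known.update(pairs)
        if pairs:
            starts.add(pairs[0])
            ends.add(pairs[-1])
    return known, starts, ends


def judge(string, known, starts=None, ends=None):
    """Accept a string if all its pairs are known.

    If `starts`/`ends` are supplied, the first and last pair must also have
    occupied those positions in training (the positional knowledge that the
    text says subjects had but the pair-only model lacked).
    """
    pairs = bigrams(string)
    if not pairs or any(p not in known for p in pairs):
        return False
    if starts is not None and pairs[0] not in starts:
        return False
    if ends is not None and pairs[-1] not in ends:
        return False
    return True


if __name__ == "__main__":
    training = ["MVT", "VXM", "MTTV", "MTTTVT", "VXVRXR"]  # Figure 2(b)
    known, starts, ends = learn(training)
    print(judge("MXT", known))                 # contains an unseen pair
    print(judge("TVXM", known))                # seen pairs, unseen order
    print(judge("TVXM", known, starts, ends))  # with positional knowledge
```

In this toy setting a string built from seen pairs in an unseen order is accepted by pair knowledge alone and rejected once start and end positions are taken into account, which is the qualitative pattern the paragraph above describes.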
Together with the behavior of Servan-Schreiber and Anderson's (1990) chunking model, these results show that simple fragment-memorization systems can be sufficient to account for subjects' imperfect performance on grammaticality tests.

Dienes et al. (1991) also found evidence that subjects were sensitive to more than just pairs. Following training and a grammaticality task, subjects were given incomplete letter sequences varying in length from zero letters upwards (e.g., VXT...) and asked to judge which single-letter continuations (M? V? X? R? T?) were acceptable at the next location in the string. In this sequential letter dependencies (SLD) task, which was hypothesized to be sensitive to conscious knowledge of the grammar, subjects were sensitive to illegal orders of legal pairs even in the middle of strings. Dienes et al. showed that the knowledge that subjects demonstrated in the completion task correlated with their grammaticality judgments and could be used to model the grammaticality judgment data. They found in addition that knowledge gleaned from subjects' free reports also correlated with their grammaticality judgments, but that less knowledge was reported in the free report task than in the continuation task. These correlations suggest that a single knowledge
source is tapped by both tasks, but that the free report task, uncued recall, is less sensitive.

Reber and Allen (1978) asked subjects to describe retrospectively their learning experience and, concurrently, to justify their grammaticality judgments. Overall, subjects justified their classifications on 821 out of 2,000 test strings. Subjects reported using a variety of information in making their grammaticality judgments. The violation or nonviolation of bigrams was the most common justification, especially concerning the first bigram of a string. String-initial bigrams accounted for fully 30% of the justifications. Violations of single letters, particularly the first or last letter of a string, and violations of trigram or longer sequences were also reported, as well as recognition of and similarity to whole training strings. The grammaticality responses to the remaining unjustified cases presumably consisted of guessing or of knowledge that could not be elicited by verbal report.

So much for substring knowledge. Vokey and Brooks (1992; Brooks & Vokey 1991) have argued that subjects can encode whole-item information in addition to substring information. They found that the similarity of test strings to specific whole-study strings is an important factor in subjects' grammaticality judgments. When the grammaticality and the similarity of the test strings were varied independently, they were shown to be additive factors on grammaticality judgments. Vokey and Brooks argued that such a result indicates that subjects have encoded the whole strings and can determine similarity relationships between strings.
Brooks and Vokey's evidence for whole-string information raises no particular problems for our interpretation of the artificial grammar learning data, since subjects are clearly aware of their whole-string knowledge just as they are aware of the substring knowledge; the study task, after all, requires the subjects specifically to memorize whole strings. However, as Brooks and Vokey (1991, p. 321) themselves concede, their results can at least in principle be explained without reference to whole-item knowledge. Just as grammatical test strings tend to contain more studied bigrams than nongrammatical strings (Perruchet & Pacteau 1990), so also a test string that is highly similar to a study string will contain more studied bigrams than one that is less similar. In fact, Vokey and Brooks' results have been challenged by Perruchet (1994), who has shown that both the effect of similarity and the apparently independent effect of grammaticality that Vokey and Brooks obtained can in turn be reduced to substring knowledge. Grammatical test strings tend to contain more substring components that were part of the training strings than do nongrammatical items. The same is true for similar and dissimilar test items, with similar items tending to contain more substring components from the study strings.

A final piece of evidence supports the view that grammaticality judgments are controlled by comparison to memorized substring or whole-item information. On such a view, but not on an abstraction account, it is likely that judgments would be relatively susceptible to changes in the superficial characteristics of the studied strings. To test this, Whittlesea and Dorken (1993) required subjects to pronounce the training strings from one grammar and to spell the training strings from another grammar. At test, subjects were asked either to pronounce or to spell test strings and to judge their grammatical status.
Subjects were more likely to assign test strings to grammars when the encoding task matched the task for the test string than when they differed. Test strings that were equally similar to strings in both grammars were assigned to the grammar where the encoding and test tasks matched. Such results, although consistent with the idea that judgments are based on a comparison with a set of items in memory that represent the study items in a relatively unanalyzed form, would clearly not be anticipated if what was encoded were the underlying abstract rules of the grammar.

Our conclusion from this section, then, is that subjects use their memory system to acquire knowledge of (possibly) whole strings and (certainly) their parts, and that this simple information is conscious both during acquisition and testing. The results reported by Dulany et al. (1984), Perruchet and Pacteau (1990), and Dienes et al. (1991) show that the knowledge that subjects can consciously retrieve in a recognition test is sufficient to explain their grammaticality judgments. From the evidence we have considered, we do not need to assume the existence of an additional implicit knowledge base, and conclusions to the contrary have arisen because of failures to meet the Information Criterion.

Our interpretation rests on the results of a variety of tests of conscious knowledge that have attempted to address the Sensitivity Criterion. Dienes et al.'s (1991) SLD test, for instance, which required subjects to judge which continuations of a sequence of letters were legal, was actually found in a signal detection analysis to be more sensitive than the implicit grammaticality test itself. Thus, if such a test is accepted as a measure of explicit knowledge, no evidence of a dissociation between learning and awareness emerges. Of course, an alternative (see Reber et al. 1985 and the reply, Dulany et al.
1985) is to argue that performance on these explicit tests is contaminated by unconscious influences: subjects may choose a correct continuation on the SLD test as a result of some implicit knowledge to which they do not have conscious access.2 The problem with this interpretation, however, is that it means we would have to abandon the test as an index of conscious learning and rely instead on verbal reports, in which case it is hard to see how the Sensitivity Criterion can ever be met. And if that criterion cannot be met, then how are defenders of unconscious learning ever going to unconfound test type from sensitivity, and hence establish the existence of unconscious learning?

We believe it is rather unlikely that unconscious influences play a significant role in the SLD test. Presenting subjects with a letter sequence (e.g., VXT...) and asking them to judge, under no time pressure, whether a given letter (e.g., M) could continue the sequence would seem to be a prototypical example of a task requiring conscious reflection, even if it involves mere conscious recollection of studied strings. Nevertheless, to claim that the SLD test is only sensitive to conscious information does require adopting what Reingold and Merikle (1988) call the "exclusiveness" assumption: the assumption that performance on a test of awareness is only affected by conscious influences. This, of course, is a very strong assumption and one that may well be incorrect.
2.5.2. Learning systems. In addition to the question of awareness, a second issue concerns whether whole item, bigram, and possibly rule information are acquired by a single learning system or by separate systems. If they are acquired by separate systems, perhaps those systems interfere with each other's operation? To examine this possibility, Reber and Allen (1978) manipulated the training task. Subjects either observed the strings without any explicit task (observation training), or they performed a paired-associate task, where each string was paired with a different city name. The idea was that the paired-associate task would require better item encoding, thereby facilitating item knowledge but potentially inhibiting other learning processes.

The paired-associate task produced several significant differences from the observation task. Overall, paired-associate subjects were less accurate on their grammaticality judgments: 72.4% versus 81.2% accurate for observation subjects. Paired-associate subjects produced twice as many recognition justifications as did observation subjects (77 vs. 40), and paired-associate subjects' probability of making consistent errors suggested they were more likely to develop unrepresentative knowledge than were observation subjects.

Clearly, the two training tasks affected the quantity of whole-item and substring knowledge that was acquired, but the underlying learning processes do not appear to be in opposition. The verbal reports show that both groups justified their responses with the same knowledge sources, but to differing degrees. It appears, then, that whole-item learning is compatible with substring learning. Vokey and Brooks (1992) examined a range of encoding tasks that produce differences in the extent of item knowledge, but they also found no reliable interference between item knowledge and substring knowledge.

Finally, Dienes et al.
(1991) required subjects to generate random digits during training. Their goal was to test whether this task would interfere selectively with subjects given explicit instructions to search for rules that describe the study strings, but not with subjects given implicit instructions simply to observe the study strings. Instead, Dienes et al. found equivalent reductions in learning for both implicitly and explicitly instructed subjects.

2.5.3. Implicit rule induction. Although the considerable evidence presented above supports the conclusion that subjects' knowledge consists of simple substrings (or whole strings), there are two further pieces of evidence that support the conclusion that subjects learn rules.

The first piece of evidence supporting rule learning was reported by Reber and Lewis (1977). Subjects were trained on a subset of strings and then solved "anagrams" based on the remaining strings generated from the grammar; that is to say, they took strings of letters and rearranged them to make grammatical strings. The frequencies of bigrams produced by subjects in the anagram task were tabulated and compared with the frequencies of the bigrams in the training set and in the full set of grammatical strings. If subjects were learning bigram frequencies from the training strings, the correlation between the frequencies of bigrams in the training strings and in the solved anagram strings should be high. While this was the case, Reber and Lewis found that the correlation between the frequencies of bigrams in the solved anagrams and in the whole grammar was actually higher. This result suggests that the subjects went beyond the training set to learn the rules of the grammar.

Perruchet et al. (1992), however, argued that Reber and Lewis's (1977) result must hold on statistical grounds alone. The anagrams demand the production of certain bigrams and not others, in fact, exactly those bigrams that are underrepresented in the training set.
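The core of this statistical argument can be written out in a line of algebra. The notation is introduced here purely for illustration and does not come from the studies themselves: for each bigram i, let t_i, a_i, and f_i be its frequency in the training strings, in the correctly solved anagrams, and in the full set of grammatical strings, and assume, as the argument does, that the training strings and the anagram solutions together make up the full set.

```latex
% Assumption: training strings and solved anagrams partition the full set.
f_i = t_i + a_i
% Covariance of anagram frequencies with full-grammar frequencies:
\operatorname{Cov}(a, f) = \operatorname{Cov}(a, t + a)
                         = \operatorname{Cov}(a, t) + \operatorname{Var}(a)
```

Because Var(a) cannot be negative, the anagram frequencies covary at least as strongly with the full-grammar frequencies as with the training frequencies, whatever the subjects have learned. Whether the same ordering holds for the correlations also depends on the variances of t and f, so this is an indication of the direction of the effect rather than a proof.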
Suppose, for example, that VT is a bigram in the grammar that is underrepresented among the training strings. VT must then be overrepresented among the solved anagram strings, since the training and correctly solved anagram strings together constitute the complete set of grammatical strings. It is no wonder, then, that the correlation between the frequencies of anagram bigrams and training bigrams is low and that the correlation between anagram bigrams and the full grammar bigrams is higher. Perruchet et al. went on to demonstrate this fact empirically by training subjects only on the individual bigrams from the training strings. Under these circumstances, subjects could not be learning rules because they only saw bigrams, yet as with Reber and Lewis's subjects the frequencies of their anagram bigrams also correlated better with the full grammar bigrams than with the training string bigrams. The original conclusion, therefore, that subjects go beyond the training strings to learn rules appears to have been an artifact of the experimental design.

The second and more compelling piece of evidence for abstraction is the fact that subjects show some degree of transfer to strings governed by the same underlying grammar, but formed from a new set of letters or from a completely new set of stimuli such as tones. Reber (1969) trained subjects to recall grammatical strings, and when he switched to a new set of letters, subjects showed no increase in recall errors. This result suggested that subjects had learned abstract rules that were easily instantiated with different letters. More impressively, Altmann et al. (in press) required subjects to observe a set of letter strings, generated from the grammar shown in Figure 2, prior to making grammaticality judgments concerning sequences of tones. Some of the tone sequences could be generated from the grammar by substituting a tone for a letter (e.g., middle C for the letter M). Altmann et al.
found that exposure to letter sequences allowed grammatical and nongrammatical tone sequences to be discriminated at better-than-chance levels. Although the improvement was generally small (about 5% increase in correct classifications), this result strongly suggests that at least some aspects of the abstract structure of the letter sequences had been isolated and were available to aid classification of the tone sequences.

It is important to note that the change of stimulus set did have a detrimental effect on performance, however. Compared to a situation in which the study and test items were from the same set (both letters or both tones), classification performance was significantly impaired when the study and test sets differed. Thus, abstract knowledge was plainly not the sole source of information that subjects were relying on: specific memorized fragments or strings must also have been playing a role. A study by Mathews et al. (1989) confirms this conclusion. In Mathews et al.'s study, over a series of training sessions
subjects were trained either on a single-string set or on different sets generated from the same underlying grammar. Subjects in the same-set condition learned better, and a final switch to a new set doubled the error rates in the single-set training condition. Such a result would not be expected if an abstract set of underlying rules were the sole factor guiding classification, because the rules would apply equally to the new and to the original letter set.

What is the significance of these results for unconscious learning? To the extent that subjects might be poor at describing what they have abstracted, such results may imply that unconscious learning is taking place. But given the rather small improvement in classification performance that results from training and testing on different sets of items, it is quite likely that what is abstracted is fairly limited (e.g., only two initial symbols are legal, the first two symbols of a string cannot be the same, etc.), and it is quite possible that subjects, if asked, would be able to report such simple regularities. In sum, although the data from transfer studies do suggest that some aspects of the underlying structure can be abstracted, from the point of view of unconscious learning the significance of these findings has yet to be established.

2.5.4. Conclusions. These studies indicate that relatively simple information is to a large extent sufficient to account for subjects' behavior in artificial grammar learning tasks. In addition, and most important, this knowledge appears to be reportable by subjects. Appreciable knowledge of the grammar does not seem to be acquired by explicit hypothesis testing or other complex analytic processes (although we return in sect. 3.2 to consider some rather different cases where grammars appear to be learned explicitly).
Instead, knowledge seems to be mainly accumulated over training by simple memory mechanisms that collect frequency statistics on bigrams, slightly longer sequences, and possibly whole items.

2.6. Awareness in instrumental learning tasks

In contrast to the conditioning and artificial grammar studies described above, which arrange relationships between external cues, instrumental tasks establish some contingency between an action the subject performs and an associated outcome. Learning is measured as a change across trials in the propensity to perform the action. Naturally, the question we may again ask is whether such learning can occur without awareness. As in his review of Pavlovian learning studies, Brewer (1974) concluded that the answer to this question is no. There have recently been some further investigations of the role of awareness in instrumental learning: we consider results separately from tasks in which the instrumental contingency is simple or more complex. By "simple" we mean any task in which there is ostensibly just one action available to the subject.

2.6.1. Simple instrumental learning tasks. Svartdal (1989; 1991) has reported a number of studies in which subjects are led to believe that there is a relationship between a reinforcer and one aspect of responding, when in fact the critical variable is some other aspect of responding. For example, Svartdal (1991) presented subjects with brief trains of between 4 and 17 auditory "clicks." Subjects immediately had to press a response button exactly the same number of times and were instructed that feedback would be presented when the number of presses matched the number of clicks. In reality, however, feedback was contingent on the rate of responding: for some subjects, it was given when the interresponse times (IRTs) were lower than in a baseline phase, while for others it was given when IRTs were higher.
Svartdal (1991) obtained evidence of learning, in that IRTs adjusted appropriately to the reinforcement contingencies, but subjects seemed to be unaware that it was the rate of responding that was important. A structured questionnaire revealed no evidence of awareness of the contingency between response rate and feedback in subjects whose response rate had adjusted appropriately.

Such demonstrations appear at first glance to be quite compelling, especially as the contingency to be learned is such a simple one. It is unclear, however, that the Information Criterion is met in these and similar studies, because it is very difficult to rule out the possibility that subjects acquire "correlated" hypotheses about the reinforcement contingency that are incorrect from the experimenter's point of view but happen to produce response profiles that are difficult to distinguish from those generated by the correct hypothesis. For example, suppose subjects learn that resting their hand in a certain position increases reinforcement rate. This could be a true experienced contingency if that hand position was conducive to a fast or slow response rate. Such an "incorrect" hypothesis would generate behavior that was very similar to what would be produced by the correct hypothesis, yet a subject who reported hand position as the crucial variable would be regarded by the experimenter as "unaware" of the reinforcement contingency.

Although such a criticism is undoubtedly post hoc, there is good evidence of subjects' behavior being under the control of such correlated hypotheses. In the 1950s, a number of studies asked subjects to generate words ad libitum and established that the probability with which they would produce, say, plural nouns was increased if each such word was followed by the experimenter saying "umhmm" (e.g., Greenspoon 1955); as with Svartdal's (1991) experiment, this result occurred in subjects apparently unable to report the reinforcement contingency.
However, in an elegant study, Dulany (1961) proved that subjects were hypothesizing that reinforcement was contingent on generating a word in the same semantic category as the previous one. Although incorrect, this hypothesis was correlated with the true one, in that if the subject said "emeralds" and was reinforced, then staying in the same semantic category meant they were more likely to produce another plural noun ("rubies") than if they shifted categories. Thus the subjects were perfectly aware of the contingency that was controlling their behavior, namely, the contingency between staying in the same semantic category and reinforcement.

In sum, even ignoring possible insensitivity in the test of verbal awareness, results such as Svartdal's (1991) cannot be taken as conclusive evidence of unconscious learning. Subjects may learn a rather different contingency from that explicitly programmed by the experimenter, and the Information Criterion may hence fail to be met. The problem is particularly worrisome in operant studies because, by definition, the experimenter has little