Journal of Applied Psychology
1989, Vol. 74, No. 3, 478-494
Copyright 1989 by the American Psychological Association, Inc. 0021-9010/89/$00.75

Validity of Personnel Decisions: A Conceptual Analysis of the Inferential and Evidential Bases

John F. Binning
Illinois State University

Gerald V. Barrett
University of Akron

Issues common to both the process of building psychological theories and validating personnel decisions are examined. Inferences linking psychological constructs and operational measures of constructs are organized into a conceptual framework, and validation is characterized as the process of accumulating various forms of judgmental and empirical evidence to support these inferences. The traditional concepts of construct-, content-, and criterion-related validity are unified within this framework. This unified view of validity is then contrasted with more conventional views (e.g., Uniform Guidelines, 1978), and misconceptions about the validation of employment tests are examined. Next, the process of validating predictor constructs is extended to delineate the critical inferences unique to validating performance criteria. Finally, an agenda for programmatic personnel selection research is described, emphasizing a shift in the behavioral scientist's role in the personnel selection process.

Demonstrating the validity of decisions based on psychological assessment procedures is of fundamental importance to personnel and other applied psychologists. Furthermore, few would argue with the fact that generating and articulating validity evidence is a complex process. To fully appreciate this complexity, it is important to realize that conceptions of validity have evolved over the years through the melding of legal, technical, and practical concerns about the quality and utility of personnel decisions.
Inevitably, differences of interpretation and opinion have arisen as each constituency has viewed these myriad concerns from uniquely important perspectives. Perhaps equally inevitable, however, is the confusion that has grown out of these differences. Because this confusion ultimately limits the effectiveness of practitioners and theorists alike, the need for greater clarity cannot be overestimated (Guion, 1987; Landy, 1986; Tenopyr, 1986).

This article is based on the premise that all validity issues discussed in personnel contexts have some conceptual counterpart in the general process of theory development (Landy, 1986). Moreover, various departures from this "ideal" process have led to myopic, if not erroneous, conceptions of validity. To elucidate how these departures have distorted conceptions of validity, the article is divided into four major sections. In the section that immediately follows, we review how the general concept of scientific validity implies a simple model in which constructs and measures of such are inferentially linked. In the next section, we suggest that in personnel selection contexts, a conceptually truncated adaptation of this model often implicitly guides the validation of predictor-criterion relationships. This truncation has for years had an undesirably limiting influence on conceptions of validity.

[Author note: A shorter version of this article was presented at the annual meeting of the Academy of Management, Anaheim, California, August 1988. We would like to thank the following people for helpful comments on earlier drafts of this article: Jeff Facteau, Mel Goldstein, Steven Landau, Pat Maloney, Tim Mooney, John Pryor, Pat Raymark, Glenn Reeder, Bob Rumery, Jay Thomas, Karen Williams, Kenneth York, and two anonymous reviewers. Correspondence concerning this article should be addressed to John F. Binning, Department of Psychology, Illinois State University, Normal, Illinois 61761.]
Perhaps its most damaging effect has been the relative neglect of criterion validity concerns. In remedial response to this, a third model is presented. This model is designed to restore and clarify the severed criterion portions of the original. Finally, suggested strategies for elaborating the proposed model and broadening conceptions of validation are discussed.

Validation Vis-a-Vis Theory Development

It is now commonly accepted that validity is not a characteristic of a test or assessment procedure but, instead, of inferences made from test or assessment information (Cronbach, 1970; Guion, 1980, 1987; Landy, 1986; Society for Industrial and Organizational Psychology, 1987; American Psychological Association, 1985). An inference is valid to the extent that it is supported by sound evidence. Expressed alternatively by Nunnally (1978), "one validates not a measuring instrument but rather some use to which a measuring instrument is put" (p. 87). Logically, therefore, to examine the concept of validity in personnel decision making, it is important to delineate (a) the types of inferences involved in applied personnel decision situations and (b) the nature of evidence that can be used to support such inferences.

Inferences Linking Psychological Constructs

Following Landy's (1986) lead, it is both appropriate and important to view the process of validating a particular selection procedure as a special case of hypothesis testing and scientific
theory building. The following rudimentary characterization of the theory-building process will provide a backdrop for further discussion of some important validity concepts.

Psychological constructs are labels for clusters of covarying behaviors. In this way, a virtually infinite number of behaviors is reduced to a system of fewer labels, which simplifies and economizes the exchange of information and facilitates the process of discovering behavioral regularities. For example, it is less cumbersome to refer to the relation between verbal and quantitative ability than to the abilities to add, subtract, multiply, and divide numbers, fractions, decimals, and so forth, and their relations to reading, spelling, understanding word meanings, and so on.

Putting aside the perennial debate over the objective existence of psychological traits and psychologists' constructs (Cronbach & Meehl, 1955; Kane, 1982; Loevinger, 1957; Messick, 1981; Nunnally, 1978), viewed pragmatically, a construct is merely a hypothesis about which behaviors will reliably covary. Constructs are heuristic devices for describing behavioral domains. Of course, construct domains can vary in being large versus small, specific versus general, and fuzzy versus clearly defined (Guion, 1987; Nunnally, 1978). Also, constructs become the object of conceptual scrutiny in their own right. In other words, psychologists hypothesize both (a) whether certain behaviors will covary and (b) whether the clusters of covarying behaviors (constructs) tend to covary in meaningful ways. In this general sense, the terms construct validation and theory development imply the same basic process. Both refer to the process of identifying (and often reifying) constructs by developing measures of such constructs and examining relationships among the various measures. Nunnally (1978) delineated the four inferences that form the core of this construct validation process.
These four inferences logically bind the components of the model presented in Figure 1. One can attempt to determine whether an inferred relationship between two constructs (e.g., anxiety and manual dexterity) exists by developing measures or causal conditions for each (labeled X and Y, respectively). It is important to emphasize that these measures are nothing more than procedures for sampling behaviors within the respective construct domains. The following four inferences then follow logically:

1. X and Y relate in some specified way.
2. X is a measure of (or treatment that induces) anxiety.
3. Anxiety and manual dexterity are causally related in some specified way.
4. Y is a measure of (or treatment that induces) manual dexterity.

[Figure 1. Critical inferential linkages in the theory-building process.]

Even though these four inferences are interrelated, a single experiment cannot validate all four inferences simultaneously. In fact, Inference 1 is the only one that can be empirically tested directly. That is, we can use our measures of anxiety and manual dexterity to derive scores that are subsequently found to relate either experimentally or correlationally. These data serve as empirical evidence of the veridicality of Inference 1. From this one empirical finding, therefore, it would be necessary to infer the truth or falsity of the others, because Inferences 2, 3, and 4 each link an observable measure with a hypothetical construction. Of course, merely finding a correlation between X and Y leaves open several alternative interpretations of possible relationships. For example, perhaps anxiety and manual dexterity are both related to some third construct. To provide incontrovertible proof that the four inferences are correct, it would be necessary to empirically demonstrate three of the inferences.
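The asymmetry described here, in which only Inference 1 (the relation between the observed measures X and Y) is directly testable while the construct-level links must be assumed or inferred, can be illustrated with a small simulation. The sketch below is an illustration of ours, not part of the original article: the construct-level correlation and the measure reliabilities are arbitrary assumed values, and the attenuation formula used to check the result comes from classical test theory rather than from this article.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical latent construct scores for anxiety and manual dexterity,
# generated with an assumed construct-level correlation (Inference 3).
true_r = -0.5
cov = np.array([[1.0, true_r], [true_r, 1.0]])
anxiety, dexterity = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

# Fallible measures X and Y sample their construct domains with error
# (Inferences 2 and 4); the reliabilities below are assumed values.
rel_x, rel_y = 0.8, 0.7
X = np.sqrt(rel_x) * anxiety + np.sqrt(1 - rel_x) * rng.standard_normal(n)
Y = np.sqrt(rel_y) * dexterity + np.sqrt(1 - rel_y) * rng.standard_normal(n)

# The only quantity a researcher can observe directly is the X-Y
# correlation -- the empirical evidence bearing on Inference 1.
observed_r = np.corrcoef(X, Y)[0, 1]

# Classical test theory predicts the observed correlation is attenuated:
# E[r_xy] = r_constructs * sqrt(rel_x * rel_y).
expected_r = true_r * np.sqrt(rel_x * rel_y)
print(f"observed r = {observed_r:.3f}, attenuation prediction = {expected_r:.3f}")
```

The simulation makes the article's point concrete: the observed correlation is real evidence, but by itself it cannot distinguish whether the measures tap the intended constructs (Inferences 2 and 4) or whether the constructs themselves are related (Inference 3); those links were stipulated in the code, just as they must be assumed or separately supported in practice.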
If three of the linkages are unequivocally proven correct, then complete confidence in the fourth would be justified. However, because this direct empirical proof is impossible (Nunnally, 1978), typical practice is to assume that two of the three inferences (2, 3, or 4) are correct and this, combined with empirical evidence of Inference 1, allows a valid conclusion regarding the remaining inference. Generally, these conclusions about construct validity are strengthened in those situations in which the truth of the assumptions is obvious to everyone scrutinizing the conclusions drawn. Specifically, we are more confident that a test validly measures a given construct if (a) the behavioral domain of the other construct is explicitly defined and (b) the assumption of a relationship between the two constructs is unarguable (Nunnally, 1978).

The Three Faces of Construct Validity

To avoid confusion, it is important to realize that the term construct validity has thus far been used to describe the soundness of evidence supporting any of the four inferences. Thus, the term is being used in its most general sense in reference to construct-construct links (Inference 3), construct-measure links (Inferences 2 and 4), and measure-measure links (Inference 1). However, what have traditionally been of particular concern to research psychologists and psychometricians are construct-measure links (i.e., Inference 2 or 4). In the heyday of trait psychology, construct validity often referred to whether a given test or measurement procedure allowed accurate inferences about an individual's standing on a psychological construct of particular interest (D. T. Campbell & Fiske, 1959; Cronbach & Meehl, 1955; Ebel, 1977; Guion, 1980; Messick, 1980). These two uses of the term construct validity (equal concern for Inferences 1, 2, 3, & 4 vs. primary concern for only Inference 2 or 4) are clearly congruent.
In fact, the difference in perspective was recognized by Loevinger (1957) when she referred to the validity of the construct versus the validity of the
test as a measure of the construct (Landy, 1986). If theory building is of primary interest, Inferences 1, 2, 3, and 4 are all of equal importance. On the other hand, in specific situations (e.g., development of a new test), Inference 2 (or 4) is emphasized. This becomes potentially more confusing because the term construct validity has a somewhat different connotation in the personnel selection literature. Here, it has been frequently used to describe a specific evidential approach for justifying a specific measure-construct link (i.e., the predictor-performance linkage portrayed in Figure 2) by documenting underlying construct-construct and construct-measure links (Schwab, 1980). The inferences implicated in this latter meaning are described in detail in the next section. Perhaps the most important issue at this juncture is to realize that these various meanings of the term construct validity are nothing more than different views of the same logical system, with varying emphasis on different inferences.

Examining Traditional Conceptions of Validity

A common conception of the personnel selection process involves (a) analysis of the job to determine (b) a performance domain, defined in terms of job behaviors or outcomes, which then guides (c) the selection or development of certain assessment procedures, which make possible (d) predictions about the likelihood that applicants will perform the job with a certain degree of proficiency, and then subsequently (e) evaluating individual performance by some operational criterion measure (e.g., Cascio, 1987; Muchinsky, 1987; Society for I/O Psychology, 1987; APA, 1985; Uniform Guidelines, 1978). This process implies a framework, presented in Figure 2, which parallels Figure 1 in many respects. The framework represented in Figure 2 portrays the following inferences:

5. Predictor measurements relate to criterion measurements.
6. The predictor measure is an adequate sample from a psychological construct domain.
7. The predictor construct domain overlaps with the performance domain.
8. The criterion measure is an adequate sample from the performance domain.
9. The predictor measure is related to the performance domain.

These inferences serve to link the components in Figure 2 analogously to the inferences in Figure 1. It is important to realize, however, that in the transition from Figure 1 to Figure 2, two important differences have arisen. First, an additional measure-construct linkage (Inference 9) has been created, linking the predictor measure and the performance domain. Second, rather than equal emphasis being placed on all inferences, this additional measure-construct (Inference 9) link has taken on greater relative importance. The implications of this way of thinking for understanding the validation process are explored in the discussion that follows. Before detailing how validation of personnel selection decisions is merely a special case of the more general construct validation process, it would be helpful to discuss the process of conceptualizing and constructing behavioral domains.

[Figure 2. A common conception of the inferences for personnel selection.]

Contrasting Predictor Construct Domains and Performance Domains

In an attempt to simplify the virtually infinite number of behaviors that can be exhibited by human beings, psychologists attempt to identify naturally occurring clusters, then construct labels for them, and investigate the covariance between them. Predictor constructs, therefore, represent those clusters of covariant behaviors identified through psychological research and constructed to enhance our general understanding of behavior.

In contrast to the psychologists' search for naturally occurring behavioral construct domains across myriad situations, organizational designers in effect create behavioral domains to enhance their understanding and prediction of job behavior. In fact, it is important to realize that from our pragmatic viewpoint, a job performance domain is a construct, albeit in a conceptually different sense than is usually implied in the psychological literature. Nonetheless, the performance of any job in any organization is a cluster of interlocked and covariant behaviors, and this cluster consists of a subset of all possible behaviors necessary for the organization to accomplish its broader goals and objectives (Weick, 1979). Just as psychological constructs represent behavioral domains, performance associated with a job (or distinguishable aspects of job performance) represents a behavioral domain.

Performance domains are conceptually distinct from predictor constructs in that the universe to be sampled is delineated differently. Construct domains on the predictor side are conceived of by the research psychologist with reference to some theoretical framework developed to explain general regularities in human behavior. Performance domains are determined, or at least influenced, by organizational decision makers and selection specialists collaborating to translate broad organizational objectives into normative statements of valued behaviors and outcomes.

The overriding reason for constructing behavioral domains on both the predictor and the performance side is the parceling of myriad behaviors into meaningful clusters to enhance understanding and communication. However, this parceling process is different on the predictor versus the performance side because of differences in (a) the conceptualization of predictor domains versus performance domains, (b) the specific purposes for speci-
In contrast to the psychologists' search for naturally occur- ring behavioral construct domains across myriad situations, or- ganizational designers in effect create behavioral domains to en- hance their understanding and prediction of job behavior. In fact, it is important to realize that from our pragmatic view- point, a job performance domain is a construct, albeit in a con- ceptually different sense than is usually implied in the psycho- logical literature. Nonetheless, the performance of any job in any organization is a cluster of interlocked and covariant behav- iors, and this cluster consists of a subset of all possible behaviors necessary for the organization to accomplish its broader goals and objectives (Weick, 1979). Just as psychological constructs represent behavioral domains, performance associated with a job (or distinguishableaspects of job performance) represents a behavioral domain. Performance domains are conceptually distinct from predic- tor constructs in that the universe to be sampled is delineated differently. Construct domains on the predictor side are con- ceived of by the research psychologist with reference to some theoretical framework developed to explain general regularities in human behavior. Performance domains are determined, or at least influenced, by organizational decision makers and selec- tion specialists collaborating to translate broad organizational objectives into normative statements of valued behaviors and outcomes. The overriding reason for constructing behavioral domains on both the predictor and the performance side is the parceling of myriad behaviorsinto meaningful clusters to enhance under- standing and communication. However, this parceling process is different on the predictor versus the performance side because of differences in (a) the conceptualization of predictor domains versus performance domains, (b) the specific purposes for sped-