In a recent paper on mixed-effects models for confirmatory analysis, Barr et al. (2013) offered the following guideline for testing interactions: "one should have by-unit [subject or item] random slopes for any interactions where all factors comprising the interaction are within-unit; if any one factor involved in the interaction is between-unit, then the random slope associated with that interaction cannot be estimated, and is not needed" (p. 275). Although this guideline is technically correct, it is inadequate for many situations, including mixed factorial designs.

The following new guideline is therefore proposed: models testing interactions in designs with replications should include random slopes for the highest-order combination of within-unit factors subsumed by each interaction. Designs with replications are designs where there are multiple observations per sampling unit per cell. Psychological experiments typically involve replicated observations, because multiple stimulus items are usually presented to the same subjects within a single condition. If observations are not replicated (i.e., there is only a single observation per unit per cell), random slope variance cannot be distinguished from random error variance, and thus random slopes need not be included.

This new guideline implies that a model testing AB in a 2 × 2 design where A is between and B within should include a random slope for B. Likewise, a model testing all two- and three-way interactions in a 2 × 2 × 2 design where A is between and B, C are within should include random slopes for B, C, and BC.

The justification for the guideline comes from the logic of mixed-model ANOVA. In an ANOVA analysis of the 2 × 2 design described above, the appropriate error term for the test of AB is MS_UB, the mean squares for the unit-by-B interaction (e.g., the subjects-by-B or items-by-B interaction).
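In lme4 syntax, the proposed guideline corresponds to model formulas like the following. This is a sketch, not the paper's own simulation script; the names dat, y, and subject are illustrative assumptions:

```r
# Sketch assuming lme4 is installed and `dat` holds the
# deviation-coded predictors; all names are illustrative.
library(lme4)

# 2 x 2 design, A between-unit, B within-unit:
# the test of AB gets a by-unit random slope for B.
m_ab  <- lmer(y ~ A * B + (1 + B | subject), data = dat)

# 2 x 2 x 2 design, A between, B and C within:
# the interaction tests get random slopes for B, C, and BC,
# the highest-order within-unit combinations they subsume.
m_abc <- lmer(y ~ A * B * C + (1 + B * C | subject), data = dat)
```

The p-values "derived from model comparison" mentioned below can then be obtained by refitting each model without the fixed-effect interaction of interest (with REML = FALSE) and comparing the two fits via anova(), a likelihood-ratio test.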
For the 2 × 2 × 2 design, the appropriate error term for ABC and BC is MS_UBC, the unit-by-BC interaction; for AB, it is MS_UB; and for AC, it is MS_UC. To what extent is this ANOVA logic applicable to tests of interactions in mixed-effects models?

To address this question, Monte Carlo simulations were performed using R (R Core Team, 2013). Models were estimated using the lmer() function of lme4 (Bates et al., 2013), with p-values derived from model comparison (α = 0.05). The performance of mixed-effects models (in terms of Type I error and power) was assessed over two sets of simulations, one for each of two different mixed factorial designs. The first set focused on the test of the AB interaction in a 2 × 2 design with A between and B within; the second focused on the test of the ABC interaction in a 2 × 2 × 2 design with A between and B, C within. For simplicity, all datasets included only a single source of random effect variance (e.g., by-subject but not by-item variance). The number of replications per cell was 4, 8, or 16. Predictors were coded using deviation (−0.5, 0.5) coding; identical results were obtained using treatment coding. In the rare case (∼2%) that a model did not converge, it was removed from the analysis. Power was reported with and without adjustment for Type I error rate, using the adjustment method reported in Barr et al. (2013).

For each set of simulations at each of the three replication levels, 10,000 datasets were randomly generated, each with 24 sampled units (e.g., subjects). The dependent variable was continuous and normally distributed, with all data-generating parameters drawn from uniform distributions. Fixed effects were either between −2 and −1 or between 1 and 2 (with equal probability). The error variance was fixed at 6, and the random effects variance/covariance matrix had variances ranging from 0 to 3 and covariances corresponding to correlations ranging from −0.9 to 0.9.
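A minimal sketch of generating one such dataset for the 2 × 2 case, under the parameters just described (24 units, deviation coding, error variance 6, random-effect variances in [0, 3]); the variable names are illustrative, and for brevity the random intercepts and slopes are drawn independently here rather than with the correlated structure used in the paper:

```r
set.seed(1)
n_subj <- 24   # sampled units (e.g., subjects)
n_rep  <- 8    # replications per cell (4, 8, or 16 in the paper)

# Deviation (-0.5, 0.5) coding: A varies between units, B within
subj <- rep(seq_len(n_subj), each = 2 * n_rep)
A    <- rep(c(-0.5, 0.5), each = (n_subj / 2) * 2 * n_rep)
B    <- rep(c(-0.5, 0.5), times = n_subj * n_rep)

# Fixed effects in [-2, -1] or [1, 2] with equal probability
betas <- runif(4, 1, 2) * sample(c(-1, 1), 4, replace = TRUE)

# By-unit random intercepts and B slopes, variances in [0, 3]
# (independent here for brevity; the paper allows correlations)
ri <- rnorm(n_subj, 0, sqrt(runif(1, 0, 3)))
rs <- rnorm(n_subj, 0, sqrt(runif(1, 0, 3)))

# Continuous normal DV with error variance fixed at 6
y <- betas[1] + betas[2] * A + betas[3] * B + betas[4] * A * B +
     ri[subj] + rs[subj] * B + rnorm(length(subj), 0, sqrt(6))

dat <- data.frame(subj = factor(subj), A = A, B = B, y = y)
```

Each of the 24 units contributes n_rep observations per cell of the within-unit factor, which is what makes the by-unit slope for B identifiable separately from residual error.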
For the 2 × 2 design, mixed-effects models with two different random effects structures were fit to the data: (1) by-unit random intercept but no random slope for B ("RI"), and (2) a maximal model including a slope for B in addition to the random intercept ("Max"). For comparison purposes, a test of the interaction using mixed-model ANOVA ("AOV") was performed using R's aov() function.

Results for the test of the AB interaction in the 2 × 2 design are in Tables 1 and 2. As expected, the Type I error rates for ANOVA and maximal models were very close to the stated α-level of 0.05. In contrast, models lacking the random slope for B ("RI") showed unacceptably high Type I error rates, increasing with the number of replications. Adjusted power was comparable for all three types of analyses (Table 2), albeit with a slight overall advantage for RI.

The test of the ABC interaction in the 2 × 2 × 2 design was evaluated under four different random effects structures, all including a random intercept but varying in which random slopes were included. The models were: (1) random intercept only ("RI"); (2) slopes for B and C but not for BC ("nBC"); (3) slope for BC but not for B nor C ("BC"); and (4) maximal (slopes for B, C, and BC; "Max"). For the test of the ABC interaction, ANOVA and maximal models both
Barr, D. J. (2013). Random effects structure for testing interactions in linear mixed-effects models. Frontiers in Psychology, 4. https://doi.org/10.3389/fpsyg.2013.00328