Regression as the Univariate General Linear Model: Examining Test Statistics, p values, Effect Sizes, and Descriptive Statistics Using R

Kim Nimon; Julia Berrios; Greggory Keiffer; Mandolen Mull; Jon Musgrave

Journal Article (Open Access)

General Linear Model Journal (2017) 43(1) 50-82

DOI: 10.31523/glmj.043001.004

Abstract

This paper presents regression as the univariate general linear model (GLM). Building on the work of Cohen (1968), McNeil (1974), and Zientek and Thompson (2009), the paper uses descriptive statistics to build a small, simulated dataset that readers can use to verify that multiple linear regression (MLR) subsumes the univariate parametric analyses in the GLM. Unlike other related works, we provide R syntax that demonstrates how MLR produces equivalent test statistics, p values, effect sizes, and descriptive statistics when compared to the univariate analyses that MLR subsumes. The paper diverges from Zientek and Thompson by presenting an expanded hierarchy for MLR and demonstrating why only the case of the chi-square test of independence where the criterion variable is dichotomous, and not the general case, is subsumed by MLR. Readers will find an accessible treatment of the GLM as well as R syntax, which they can use to report descriptive statistics, p values, and effect sizes associated with the univariate parametric statistics in the GLM.

In 1968, Cohen presented multiple linear regression (MLR) as the univariate general linear model (GLM). Since that time, Cohen's work has been extended to consider canonical correlation as the multivariate GLM (see Knapp, 1978) and structural equation modeling as an even more general case of the GLM (see Bagozzi, Fornell, & Larcker, 1981). As noted by Graham (2008), "The vast majority of parametric statistical procedures in common use are part of [a single analytic family called] the General Linear Model (GLM), including the t test, analysis of variance (ANOVA), multiple regression, descriptive discriminant analysis (DDA), multivariate analysis of variance (MANOVA), canonical correlation analysis (CCA), and structural equation modeling (SEM). Moreover, these procedures are hierarchical in that some procedures are special cases of others" (p. 485).
Beyond the hierarchical nature of the GLM, the analyses it subsumes share three characteristics. Analyses in the GLM implicitly or explicitly are correlational in nature, yield variance-accounted-for effect sizes, and produce scores on latent variables that are derived by applying weights to measured variables (Thompson, 2006, p. 360). Although the characteristics of the GLM seem to be straightforward, graduate students and emerging scholars are likely to benefit from being able to verify the hierarchical nature of the GLM through illustrations that compare univariate statistical analyses to MLR analyses. Not only has active learning been shown to be beneficial when learning statistics (White, 2015), but research (e.g., Henson, Hull, & Williams, 2010) also indicates that many graduate students and emerging scholars may have insufficient quantitative proficiency. Therefore, we offer an illustration of MLR as the univariate GLM that considers the similarities and differences in the test statistics, p values, effect sizes, and descriptive statistics generated. Namely, we consider ANCOVA, ANOVA, r, repeated measures ANOVA (RM ANOVA), independent samples t-test, paired-samples t-test, and single-sample t-test. Our interest in developing this work is similar to other methodologists who seek to "improve statistical practice, and thereby, improve the quality of the knowledge produced by the legions of researchers around the world who use these techniques on a daily basis" (Osborne, 2013, p. 1). We also make five unique contributions to the literature. First, we demonstrate MLR as the univariate GLM for parametric analyses using R, which is a free statistical programming language that is gaining popularity in social science research and that is compatible with Unix, Windows, and Mac operating systems (R Development Core Team, 2017). Prior contributions (e.g., Zientek & Thompson, 2009) have used commercial statistical software packages (e.g., SPSS).
Second, we demonstrate that the hierarchical nature of the univariate parametric statistical analyses is not as flat as portrayed in Zientek and Thompson (2009, p. 344). Namely, we show that ANOVA and r subsume the independent samples t-test. Not only is it important to show that these analyses (i.e., ANOVA, r, and independent samples t-test) are mathematically equivalent, but demonstrating that r subsumes the independent samples t-test also helps undo the misconception that correlation never implies causality and underscores that causality is a function of design, not statistics (cf. Huck, 2012). Third, we demonstrate that RM ANOVA is subsumed by MLR and subsumes the paired-samples t-test. Fourth, we demonstrate that the single-sample t-test is subsumed by MLR. Finally, we demonstrate why the general case for the chi-square test of independence cannot be subsumed by MLR and that only in the case of a dichotomous dependent variable does MLR subsume the chi-square test. Therefore, the hierarchy of analyses subsumed by MLR presented in Figure 1, which serves as a framework for our paper, diverges from the hierarchy presented by Zientek and Thompson (2009, p. 344) in important ways.

Figure 1. Multiple linear regression (MLR) as the univariate general linear model. Dotted line indicates that the χ² test of independence is subsumed by MLR only in the case of a dichotomous dependent variable. Illustrative models identified in [ ]. See formula in stats package (R Development Core Team, 2017) for formatting of model formulae.
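
The claim that ANOVA, r, and MLR each subsume the independent samples t-test can be checked numerically. The following is a minimal sketch and not the paper's own syntax or dataset: the scores and group labels are made up for illustration, and the variable names (score, group, checks, and so on) are hypothetical. It relies only on functions from base R's stats package (t.test, aov, cor, lm).

```r
# Made-up illustrative data (NOT the paper's simulated dataset):
# two groups of five scores each.
score <- c(4, 5, 6, 5, 7, 8, 9, 7, 10, 9)
group <- factor(rep(c("a", "b"), each = 5))

# Independent samples t-test, equal variances assumed.
tt <- t.test(score ~ group, var.equal = TRUE)
t_ttest <- unname(tt$statistic)
df_error <- unname(tt$parameter)

# One-way ANOVA on the same two groups.
f_anova <- summary(aov(score ~ group))[[1]][["F value"]][1]

# Point-biserial r: Pearson correlation with a 0/1 group indicator.
r <- cor(score, as.numeric(group) - 1)

# Regression of the scores on the dummy-coded group factor.
fit <- summary(lm(score ~ group))
t_lm <- fit$coefficients["groupb", "t value"]

# The sign of t depends only on which group is subtracted from which,
# so the comparisons use absolute or squared values.
checks <- c(
  t_equals_regression_t = isTRUE(all.equal(abs(t_ttest), abs(t_lm))),
  t_squared_equals_F    = isTRUE(all.equal(t_ttest^2, f_anova)),
  r_squared_equals_R2   = isTRUE(all.equal(r^2, fit$r.squared)),
  r_squared_from_t      = isTRUE(all.equal(r^2, t_ttest^2 / (t_ttest^2 + df_error)))
)
print(checks)
```

Under these assumptions, each entry of checks should be TRUE: the four analyses return the same test statistic (up to sign or squaring) and the same variance-accounted-for effect size, which is what it means for the t-test to be a special case of the others.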

Cite

Nimon, K. … Musgrave, J. (2017). Regression as the Univariate General Linear Model: Examining Test Statistics, p values, Effect Sizes, and Descriptive Statistics Using R. General Linear Model Journal, 43(1), 50–82. https://doi.org/10.31523/glmj.043001.004
