THE USE OF SPECIAL MATRIX OPERATORS IN STATISTICAL CALCULUS

Albert E. Beaton

Journal Article, Open Access

ETS Research Bulletin Series (1964) 1964(2)

DOI: 10.1002/j.2333-8504.1964.tb00689.x

Abstract

The availability of high speed computers has allowed persons in educational research not only to perform larger statistical analyses but also to think differently about their problems. The purpose of this thesis is to reexamine the methods of statistical calculus in the light of high speed computers and to present several special matrix operators which are especially useful. These operators are designed for simplicity and efficiency on high speed computers. Using these operators, we explore many different statistical techniques which are commonly used in educational research and show simple computational formulae. We observe that these statistics are computed more simply from the general linear model than from the usual computing procedures. We conclude that persons in educational research and in other areas need to place much more emphasis on mathematical models and much less emphasis on the special techniques which were developed for desk calculators.

The introduction to the thesis discusses the present methods of statistical calculus and their disadvantages as procedures for high speed computers. Efficiency on desk calculators and high speed computers is compared, showing that the many ingenious short‐cut procedures developed for desk calculators are quite inefficient for larger machines. Some other approaches to statistical methods for high speed computers are discussed briefly.

The thesis contains a review of the statistical background necessary for using the special matrix operators. The form of data and mathematical models is discussed. We present the assumptions which must be made about the error term in the model. We also present a summary of well known formulae for many statistics commonly computed in univariate and multivariate analysis. The formulae are given without proof, but the reader is referred to readily available literature for the proofs.

The six special matrix operators are discussed in detail.
Their mathematical properties are given in the text of the thesis and the details of computing are given in computer subroutines which are listed in Appendix A. The first operator is SCP (sum cross‐products), which is used to compute the known values of normal equations and the various sums of squares and cross‐products which are needed for statistical analysis. This operator may be considered as a subset of the operations performed in computing the product of a matrix pre‐multiplied by its transpose. The second operator is SWP (sweep), which is used to compute the inverse of a subsection of a matrix and to compute partial sums of squares and cross‐products. This operator may be considered as a subset of the operations used in computing an inverse. The TCM (transform cross‐products matrix) operator is used to transform the cross‐products matrix in a manner equivalent to performing an arbitrary linear transformation of the variables. The remaining three operators are specific transformations of the cross‐products matrix, each being equivalent to a specific transformation of the variables. The STD (standardize) operator converts the cross‐products matrix in a manner equivalent to transforming the variables to unit length. The MSTD (multi‐standardize) operator is used to transform the cross‐products matrix in a manner equivalent to converting some of the original variables to unit length and an orthogonal orientation. The SDG (step‐diagonalize) operator is used to compute the eigenvalues and eigenvectors of a subset of a cross‐products matrix. This is equivalent to transforming some of the input variables to eigenscores. The SDG operator may be considered as a subset of the operations performed in computing latent roots and vectors by the Jacobi method.

Many statistical techniques are redefined using the special matrix operators. The simplest example is the calculation of means and standard deviations, which may be computed using the SCP and SWP operators.
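
As a rough illustration of how means and standard deviations can fall out of SCP and SWP, the sketch below augments each observation with a constant 1, forms the cross‐products matrix, and sweeps on the constant. It is not taken from the thesis subroutines in Appendix A: the function names `scp`, `swp`, and `means_and_sds`, the sign convention for the swept pivot (set to -1/d so the matrix stays symmetric), and the sample data are all assumptions made for this example.

```python
from math import sqrt

def scp(rows):
    """Sum of cross-products: X'X for data given as a list of rows."""
    p = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]

def swp(a, k):
    """Sweep square matrix `a` on pivot k and return a new matrix.

    Assumed convention: the pivot becomes -1/d, the rest of the pivot row
    and column is divided by d, and every other element a[i][j] is reduced
    by a[i][k] * a[k][j] / d.
    """
    d = a[k][k]
    if d == 0:
        raise ZeroDivisionError("zero pivot: variable is constant or collinear")
    n = len(a)
    b = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == k and j == k:
                b[i][j] = -1.0 / d
            elif i == k or j == k:
                b[i][j] = a[i][j] / d
            else:
                b[i][j] = a[i][j] - a[i][k] * a[k][j] / d
    return b

def means_and_sds(data):
    """Means and sample standard deviations from one SCP and one SWP."""
    n = len(data)
    rows = [[1.0] + list(obs) for obs in data]   # prepend the constant column
    m = swp(scp(rows), 0)                        # sweep on the constant
    means = m[0][1:]                             # pivot row now holds the means
    # The remaining block holds sums of squares about the mean.
    sds = [sqrt(m[j][j] / (n - 1)) for j in range(1, len(m))]
    return means, sds

# Hypothetical data: four observations on two variables.
means, sds = means_and_sds([[2.0, 1.0], [4.0, 3.0], [6.0, 2.0], [8.0, 6.0]])
print(means, sds)
```

After the sweep, the off‐diagonal elements of the remaining block are the cross‐products about the means, so the same matrix can also supply covariances and correlations.
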
The testing of hypotheses that the means are zero or some other constant is shown. We also show the computing methods for the standard error of the mean. The correlation coefficient is shown to be a by‐product of computing t tests for testing the means against some hypothesized values.

Computing formulae for multiple regression and correlation are presented. Formulae for estimation of regression coefficients are shown for both raw and standard scores. Computing formulae for testing hypotheses about the regression coefficients are also shown. Formulae for partial and multiple correlation are presented. We make suggestions about how to incorporate any step‐wise criteria into regression using the special operators. The problems of singular and near‐singular matrices are discussed and several possible solutions given. A method for the computation of a principal components factor analysis is suggested.

The computing formulae for the analysis of variance are the same as those for regression. The analysis of variance may have categorical independent variables which must be converted to dummy variables before the analysis can be computed. A special vector function is presented to convert a nominal scale to dummy variables. Another special vector function is presented to compute dummy variables for interaction. The formulae for computing row, column, and interaction effects are given, as well as the procedures for testing hypotheses. The formulae are not restricted to any specific number of ways of classification or to balanced designs. The problems of missing data in the analysis of variance are discussed and a solution proposed.

The analysis of covariance is presented as a mixture of regression and analysis of variance methods. We first present procedures which assume that all cells and margins have a common regression line and then show procedures in which we do not make that assumption.
In the second case, we may test whether or not there are row, column, and interaction effects for slopes, intercepts, or entire regression lines. The formulae are quite general, allowing any number of ways of classification and any number of concomitant variables. Balanced designs are not presumed.

The procedures for multivariate analysis follow directly from the computation of univariate statistics. First, generalized regression is presented; that is, estimating the regression coefficients for several criteria simultaneously. The testing of hypotheses in multivariate analysis is discussed and several test criteria presented. Canonical analysis and computing formulae for canonical correlation coefficients are presented. We then show the relationships between the computation of a generalized analysis of variance or covariance and the univariate computations. Discriminant functions are presented and shown to be a by‐product of the computation of a generalized analysis of variance. Step‐wise procedures for multivariate analysis are suggested.

The conclusions of the thesis are in two areas, statistical calculus and statistics in general. The thesis shows that many of the commonly used statistical techniques can be computed by subjecting the data to a sequence of seven operators. Some of the operators are optional and some are trivial if there is only one dependent variable. Nevertheless, these simple steps will compute most of the statistics used in educational research in a manner which is very simple and efficient on high speed computers.

The statistical conclusions are largely a matter of emphasis, not of essential changes. The relationships among the many statistical techniques presented in this thesis have been known among statisticians for some time. However, this thesis concludes that the general formulae may be used quite easily on high speed computers (although not on desk calculators).
Therefore, we propose that we need much less emphasis on different statistical techniques and much more emphasis on mathematical models and on the transformation of data.
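
The abstract's statement that the analysis of variance uses the same computing formulae as regression can be sketched with a small one‐way example: a nominal variable is converted to dummy variables, the cross‐products matrix is formed, and the predictor pivots are swept. This is only an illustrative sketch, not the thesis code. The helper names (`scp`, `swp`, `dummy_code`), the reference coding that drops the first level, the sign convention of the sweep, and the data are all assumptions.

```python
def scp(rows):
    """Sum of cross-products: X'X for data given as a list of rows."""
    p = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]

def swp(a, k):
    """Sweep `a` on pivot k (assumed convention: pivot becomes -1/d)."""
    d = a[k][k]
    if d == 0:
        raise ZeroDivisionError("zero pivot: predictors are collinear")
    n = len(a)
    b = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == k and j == k:
                b[i][j] = -1.0 / d
            elif i == k or j == k:
                b[i][j] = a[i][j] / d
            else:
                b[i][j] = a[i][j] - a[i][k] * a[k][j] / d
    return b

def dummy_code(labels):
    """Hypothetical helper: 0/1 indicators for every level except the first.

    Dropping one level keeps the columns linearly independent of the
    constant, so the cross-products matrix is not singular.
    """
    levels = sorted(set(labels))
    rows = [[1.0 if lab == lev else 0.0 for lev in levels[1:]] for lab in labels]
    return levels, rows

# Hypothetical one-way layout: three groups, two observations each.
labels = ["a", "a", "b", "b", "c", "c"]
y = [1.0, 3.0, 4.0, 6.0, 8.0, 10.0]

levels, dummies = dummy_code(labels)
# Each row is [constant, dummy for "b", dummy for "c", criterion].
rows = [[1.0] + d + [yi] for d, yi in zip(dummies, y)]

m = scp(rows)
n_pred = len(rows[0]) - 1          # every column except the criterion
for k in range(n_pred):
    m = swp(m, k)                  # sweep the constant and each dummy

coefs = [m[i][-1] for i in range(n_pred)]   # last column: regression weights
residual_ss = m[-1][-1]                     # last diagonal: within-group SS
print(levels, coefs, residual_ss)
```

With this coding the first coefficient is the mean of the reference group and the others are differences from it, while the swept diagonal element for the criterion is the within‐group sum of squares. Group sizes need not be equal for the computation to go through.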

Cite

Beaton, A. E. (1964). THE USE OF SPECIAL MATRIX OPERATORS IN STATISTICAL CALCULUS. ETS Research Bulletin Series, 1964(2). https://doi.org/10.1002/j.2333-8504.1964.tb00689.x
