Attempt to apply partial least squares regression to regional analyses

Abstract

This study implements the statistical method of partial least squares regression (PLSR), especially as a substitute for multiple linear regression (MLR), and examines the usefulness and problems associated with applying PLSR to regional analyses. A number of problems have been reported with regard to using MLR as a statistical tool. First, when MLR is applied to cases in which explanatory variables are highly correlated with each other, multicollinearity reduces the reliability of the results. Second, increasing the number of explanatory variables relative to the number of samples also reduces the reliability of the results because of overfitting. Therefore, MLR can be applied reliably only to cases with many samples and a few explanatory variables that show low correlations with each other. In PLSR, which is often used in chemometrics, latent variables are calculated to summarize the explanatory variables. These latent explanatory variables are constructed to be far fewer than the original ones, to be uncorrelated with each other, and to have maximum covariance with the explained variable. This means that PLSR analyses can avoid multicollinearity and overfitting and achieve highly reliable prediction. In addition, PLSR can estimate regression coefficients showing the relationship between the original explanatory variables and the explained variable. This study applied MLR and PLSR to a regional analysis of solid waste generation and compared the two sets of results.
The case study used one explained variable, the volume of solid waste generated per capita, and twenty-nine explanatory variables describing the population, households, establishments, environments, and municipal policies of thirty-three municipalities in Fukui Prefecture (excluding the two outliers of Izumi village and Ooi town). Not all of the twenty-nine explanatory variables were used in every model, especially in the model to which MLR was applied, in order to avoid multicollinearity. The results showed that PLSR has considerable advantages over MLR. Three regional characteristics significantly influenced solid waste generation: whether a municipality is urban or rural; differences in municipal policies regarding recyclable waste collection; and the concentration of service industries such as inns, whose waste was considered likely to increase the per capita volume of solid waste generation because it is mixed into municipally collected waste. The PLSR model also identified many significant explanatory variables influencing the per capita volume of solid waste generated that the MLR model did not. However, some problems were discovered. The significance test on regression coefficients in PLSR must be further developed. In addition, statistical developments are needed so that PLSR can deal with spatial dependence and spatial heterogeneity.
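
The latent-variable construction described in the abstract can be illustrated with a minimal sketch. It assumes the single-response NIPALS form of PLSR (often called PLS1), which is one common algorithm; the abstract does not say which algorithm the study used. The function name `pls1_scores` and the data are hypothetical: the data are synthetic, with two nearly collinear explanatory variables, and are not the Fukui Prefecture data.

```python
import random


def pls1_scores(X, y, n_components):
    """Hypothetical helper: PLS1 via NIPALS.

    Returns (scores, y_loadings, y_mean), where scores[a] is the a-th
    latent variable evaluated for every sample.
    """
    n, p = len(X), len(X[0])
    # Center every column of X and the explained variable y.
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    E = [[row[j] - col_means[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    f = [v - y_mean for v in y]

    scores, y_loadings = [], []
    for _ in range(n_components):
        # Weight vector: unit direction in X-space with maximum covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = sum(v * v for v in w) ** 0.5
        w = [v / norm for v in w]
        # Latent variable (score) for each sample.
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        # Loadings of X and of y on this latent variable.
        p_load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate: remove what this component already explains.
        E = [[E[i][j] - t[i] * p_load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        scores.append(t)
        y_loadings.append(q)
    return scores, y_loadings, y_mean


# Synthetic data (an assumption for illustration only).
random.seed(0)
n = 30
X, y = [], []
for _ in range(n):
    base = random.gauss(0, 1)
    x1 = base + random.gauss(0, 0.01)
    x2 = base + random.gauss(0, 0.01)  # nearly collinear with x1
    x3 = random.gauss(0, 1)
    X.append([x1, x2, x3])
    y.append(2.0 * base + 0.5 * x3 + random.gauss(0, 0.1))

scores, q, y_mean = pls1_scores(X, y, n_components=2)
t1, t2 = scores

# The two latent variables are orthogonal, hence uncorrelated.
dot = sum(a * b for a, b in zip(t1, t2))

# Fitted values are built from the latent variables, not the raw columns.
fitted = [y_mean + q[0] * t1[i] + q[1] * t2[i] for i in range(n)]
ss_res = sum((y[i] - fitted[i]) ** 2 for i in range(n))
ss_tot = sum((v - y_mean) ** 2 for v in y)
r2 = 1 - ss_res / ss_tot

print("t1 . t2 =", dot)
print("R^2 with two latent variables =", r2)
```

Because each component is extracted from the deflated data, the scores are orthogonal by construction, so the regression of the explained variable on them is not affected by the collinearity between `x1` and `x2`. In practice the number of components would be chosen by cross-validation, which this sketch omits.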

Cite

Namie, A. (2007). Attempt to apply partial least squares regression to regional analyses. Geographical Review of Japan, 80(4), 178–191. https://doi.org/10.4157/grj.80.178
