We explore a new method, stacked interval sparse PLS (SISPLS) regression, which combines stacked interval partial least squares (SIPLS) and sparse partial least squares (SPLS) regression. The method splits spectral data into discrete, equally spaced intervals, builds an optimized SPLS model on each interval, and weights the local models according to the cross-validation error achieved during the optimization. The method is highly flexible and performs explicit variable selection only when advantageous; the aim is instead to find favorable rotations of the classical PLS solution while also using local information in a spectrum. The SISPLS regression vector clearly highlights regional and variable importance in the data, permitting a more straightforward interpretation of the resulting model. For a given dataset, the optimal interval size is determined by randomly sampling the calibration data and exhaustively testing the feasible interval sizes. The method is demonstrated on two near-infrared (NIR) datasets and a Raman dataset. In addition to the multi-faceted interpretational advantage from the variable selection and weighting, we show that the predictions from the method are competitive with those from PLS, SPLS, SIPLS, and variable importance in projection (VIP) selection.
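The interval-splitting, local-model, and weighting steps described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions made for illustration: the function names (`soft_threshold`, `spls1_fit`, `cv_rmse`, `sispls_fit`) are hypothetical, the sparse PLS step is a simplified single-response variant that soft-thresholds each weight vector by a fraction `eta` of its largest entry, the candidate grids and interleaved folds are arbitrary, and the stacking weights are taken proportional to the inverse squared cross-validation error, which is one plausible choice rather than one stated in the abstract.

```python
import numpy as np

def soft_threshold(w, eta):
    """Shrink w toward zero by eta * max|w| (assumed sparsity rule)."""
    thr = eta * np.max(np.abs(w))
    return np.sign(w) * np.maximum(np.abs(w) - thr, 0.0)

def spls1_fit(X, y, n_comp, eta):
    """Simplified sparse PLS1 via NIPALS with soft-thresholded weights.
    With eta = 0 this reduces to ordinary PLS1.
    Returns (coef, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_comp):
        w = soft_threshold(Xr.T @ yr, eta)
        norm = np.linalg.norm(w)
        if norm < 1e-12:          # nothing left to model
            break
        w = w / norm
        t = Xr @ w                # scores
        tt = t @ t
        if tt < 1e-12:
            break
        p = Xr.T @ t / tt         # X loadings
        c = yr @ t / tt           # y loading
        Xr = Xr - np.outer(t, p)  # deflate X and y
        yr = yr - c * t
        W.append(w)
        P.append(p)
        q.append(c)
    if not W:
        return np.zeros(X.shape[1]), x_mean, y_mean
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, x_mean, y_mean

def cv_rmse(X, y, n_comp, eta, n_folds=5):
    """Root-mean-square error of cross-validation with interleaved folds."""
    idx = np.arange(len(y))
    sq_err = 0.0
    for k in range(n_folds):
        test = idx[k::n_folds]
        train = np.setdiff1d(idx, test)
        coef, xm, ym = spls1_fit(X[train], y[train], n_comp, eta)
        pred = (X[test] - xm) @ coef + ym
        sq_err += np.sum((y[test] - pred) ** 2)
    return np.sqrt(sq_err / len(y))

def sispls_fit(X, y, n_intervals, comps=(1, 2, 3), etas=(0.0, 0.3, 0.6)):
    """Tune one local sparse PLS model per interval, then stack them.
    Returns (coef, intercept, weights); predict with X @ coef + intercept."""
    intervals = np.array_split(np.arange(X.shape[1]), n_intervals)
    models, errors = [], []
    for cols in intervals:
        Xi = X[:, cols]
        # Grid search over components and sparsity for this interval.
        err, a, e = min((cv_rmse(Xi, y, a, e), a, e)
                        for a in comps for e in etas)
        b, xm, ym = spls1_fit(Xi, y, a, e)
        models.append((cols, b, ym - xm @ b))
        errors.append(err)
    # Assumed weighting: proportional to 1 / RMSECV^2, normalized to sum to 1.
    s = 1.0 / np.maximum(np.array(errors), 1e-12) ** 2
    weights = s / s.sum()
    coef, intercept = np.zeros(X.shape[1]), 0.0
    for wgt, (cols, b, b0) in zip(weights, models):
        coef[cols] = wgt * b      # weighted local regression vector
        intercept += wgt * b0
    return coef, intercept, weights
```

Because the intervals are disjoint, the weighted local regression vectors can be written into a single full-length vector, so both the interval weights and the nonzero coefficients within each interval can be inspected directly. Selecting the interval size could then be sketched as a loop over candidate `n_intervals` values evaluated on random subsets of the calibration data.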
Poerio, D. V., & Brown, S. D. (2017). Stacked interval sparse partial least squares regression analysis. Chemometrics and Intelligent Laboratory Systems, 166, 49–60. https://doi.org/10.1016/j.chemolab.2017.03.006