Large-scale data, where both the sample size and the dimension are large, often exhibit heterogeneity. This can arise, for example, in the form of unknown subgroups or clusters, batch effects, or contaminated samples. Ignoring these issues often leads to poor prediction and estimation. We advocate the maximin effects framework (Meinshausen and Bühlmann, Maximin effects in inhomogeneous large-scale data. Preprint arXiv:1406.0596, 2014) to address the problem of heterogeneous data. In combination with partial least squares (PLS) regression, we obtain a new PLS procedure which is robust and tailored for large-scale heterogeneous data. A small empirical study complements our exposition of the new PLS methodology.
Bühlmann, P. (2016). Partial least squares for heterogeneous data. In Springer Proceedings in Mathematics and Statistics (Vol. 173, pp. 3–15). Springer New York LLC. https://doi.org/10.1007/978-3-319-40643-5_1