In studies involving genetic data, the correlations between X and Y scores obtained from PLS regression models can be used as measures of association between genome-level measurements, X, and phenotype-level measurements, Y. These correlations may be overestimated due to potential overfitting (i.e., they may be vulnerable to optimism bias). We evaluate the optimism bias through simulations and examine the effect of increasing sample size and strength of correlation. We assess the effectiveness of bootstrap-based and permutation-based bias correction methods. We also investigate the selection of the appropriate number of components for PLS regression. We include an analysis of genetic data consisting of genotypes and phenotypes related to Attention Deficit Hyperactivity Disorder (ADHD).
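The optimism bias described above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' procedure: it assumes a single PLS component computed from the leading singular vectors of the centered cross-product matrix, simulated Gaussian data in which X and Y are independent, and a simple permutation scheme that shuffles the rows of Y. The function names, dimensions, and seeds are hypothetical choices made for illustration.

```python
import numpy as np

def first_pls_score_correlation(X, Y):
    """Correlation between the first-component X and Y scores.

    Assumption: the first PLS weight vectors are taken as the leading
    left/right singular vectors of the centered cross-product X'Y.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
    t = Xc @ U[:, 0]   # X scores
    u = Yc @ Vt[0, :]  # Y scores
    return float(np.corrcoef(t, u)[0, 1])

def permutation_null_mean(X, Y, n_perm=100, seed=0):
    """Mean score correlation after shuffling the rows of Y.

    Shuffling breaks any real X-Y association, so this average
    approximates the correlation produced by overfitting alone.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    values = [
        first_pls_score_correlation(X, Y[rng.permutation(n)])
        for _ in range(n_perm)
    ]
    return float(np.mean(values))

# Simulated data (hypothetical sizes): many predictors, few samples,
# and Y generated independently of X, so the true association is zero.
rng = np.random.default_rng(42)
n, p, q = 50, 200, 5
X = rng.standard_normal((n, p))
Y = rng.standard_normal((n, q))

apparent = first_pls_score_correlation(X, Y)
null_mean = permutation_null_mean(X, Y, n_perm=100)

print(f"apparent in-sample score correlation: {apparent:.3f}")
print(f"mean correlation under permutation:   {null_mean:.3f}")
```

Even though X and Y are unrelated here, the in-sample score correlation is far from zero, and the permutation average gives a rough sense of how much of an observed correlation could be attributed to overfitting. How that null level is turned into a corrected estimate is a design choice that this sketch leaves open.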
CITATION STYLE
Cunningham, E., Ciampi, A., Joober, R., & Labbe, A. (2016). Estimating and correcting optimism bias in multivariate PLS regression: Application to the study of the association between single nucleotide polymorphisms and multivariate traits in attention deficit hyperactivity disorder. In Springer Proceedings in Mathematics and Statistics (Vol. 173, pp. 103–113). Springer New York LLC. https://doi.org/10.1007/978-3-319-40643-5_8