Objectives: Few interactions between risk factors for schizophrenia have been replicated, but fitting all such interactions is difficult because of high dimensionality. Our aims are to examine significant main and interaction effects for schizophrenia and to assess the performance of our approach using simulated data. Methods: We apply the machine learning technique elastic net to a high-dimensional logistic regression model to produce a sparse set of predictors, and then assess the significance of odds ratios (OR) with Bonferroni-corrected p-values and confidence intervals (CI). We introduce a simulation model that resembles a Finnish nested case–control study of schizophrenia that uses national registers to identify cases (n = 1,468) and controls (n = 2,975). The predictors include nine sociodemographic factors and all interactions (31 predictors). Results: In the simulation, interactions with OR = 3 and prevalence = 4% were identified with <5% false positive rate and ≥80% power. None of the studied interactions were significantly associated with schizophrenia, but main effects of parental psychosis (OR = 5.2, CI 2.9–9.7; p
Gyllenberg, D., McKeague, I. W., Sourander, A., & Brown, A. S. (2020). Robust data-driven identification of risk factors and their interactions: A simulation and a study of parental and demographic risk factors for schizophrenia. International Journal of Methods in Psychiatric Research, 29(4), 1–11. https://doi.org/10.1002/mpr.1834