Large sample size and nonlinear sparse models outline epistatic effects in inflammatory bowel disease

This article is free to access.

Abstract

Background: Despite clear evidence of nonlinear interactions in the molecular architecture of polygenic diseases, linear models have so far appeared optimal in genotype-to-phenotype modeling. A key bottleneck for such modeling is that genetic data intrinsically suffers from underdetermination (p ≫ n). Millions of variants are present in each individual while the collection of large, homogeneous cohorts is hindered by phenotype incidence, sequencing cost, and batch effects.

Results: We demonstrate that when we provide enough training data and control the complexity of nonlinear models, a neural network outperforms additive approaches in whole exome sequencing-based inflammatory bowel disease case–control prediction. To do so, we propose a biologically meaningful sparsified neural network architecture, providing empirical evidence for positive and negative epistatic effects present in the inflammatory bowel disease pathogenesis.

Conclusions: In this paper, we show that underdetermination is likely a major driver for the apparent optimality of additive modeling in clinical genetics today.
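The abstract does not specify how the network is sparsified, so the following is only a minimal sketch of one common way to control model complexity under p ≫ n: a dense layer whose weights are multiplied by a fixed binary mask, so that each variant connects only to the gene it belongs to. The variant-to-gene mapping, the function names `make_gene_mask` and `sparse_forward`, and the toy genotype are all hypothetical and are not taken from the paper.

```python
import math
import random


def make_gene_mask(variant_to_gene, n_genes):
    """Return an n_variants x n_genes 0/1 mask: 1 where a variant maps to a gene."""
    return [[1.0 if g == gene else 0.0 for g in range(n_genes)]
            for gene in variant_to_gene]


def sparse_forward(x, weights, mask, bias):
    """Masked dense layer with tanh: only connections allowed by the mask contribute."""
    out = []
    for g in range(len(bias)):
        z = bias[g] + sum(x[v] * weights[v][g] * mask[v][g] for v in range(len(x)))
        out.append(math.tanh(z))
    return out


random.seed(0)

# Hypothetical mapping (assumption): 6 variants assigned to 3 genes.
variant_to_gene = [0, 0, 1, 1, 1, 2]
n_genes = 3

mask = make_gene_mask(variant_to_gene, n_genes)
weights = [[random.gauss(0, 1) for _ in range(n_genes)] for _ in variant_to_gene]
bias = [0.0] * n_genes

# Toy genotype encoded as alternate-allele counts (0, 1, or 2) per variant.
genotype = [0, 1, 2, 0, 1, 2]
activations = sparse_forward(genotype, weights, mask, bias)

# The mask leaves one trainable weight per variant instead of one per variant-gene pair.
n_free = int(sum(map(sum, mask)))
n_dense = len(mask) * n_genes
print(f"trainable weights: {n_free} (masked) vs {n_dense} (dense)")
```

In this toy setting the mask cuts the number of trainable weights from 18 to 6, and the nonlinearity applied per gene is what lets variants within a gene interact non-additively. Whether the published architecture uses this exact masking scheme is not stated in the abstract.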

Cite

Verplaetse, N., Passemiers, A., Arany, A., Moreau, Y., & Raimondi, D. (2023). Large sample size and nonlinear sparse models outline epistatic effects in inflammatory bowel disease. Genome Biology, 24(1). https://doi.org/10.1186/s13059-023-03064-y
