Integration of datasets for individual prediction of DNA methylation-based biomarkers

This article is free to access.

Abstract

Background: Epigenetic scores (EpiScores) can provide biomarkers of lifestyle and disease risk. Projecting new datasets onto a reference panel is challenging because technical and biological variation are difficult to separate in array data. Normalisation can standardise data distributions but may also remove population-level biological variation.

Results: We compare two birth cohorts, the Lothian Birth Cohorts of 1921 and 1936 (n = 387 for LBC1921 and n = 498 for LBC1936), with blood-based DNA methylation assessed at the same chronological age (79 years) and processed in the same lab but in different years and experimental batches. We examine the effect of 16 normalisation methods on a novel BMI EpiScore (trained in an external cohort, n = 18,413) and on Horvath’s pan-tissue DNA methylation age, when the cohorts are normalised separately and together. The BMI EpiScore explains at most R² = 24.5% of the variance in BMI in LBC1936 (SWAN normalisation). Although R² differs between cohorts, the normalisation method makes a minimal difference to within-cohort estimates. Conversely, a range of absolute differences is seen in individual-level EpiScore estimates for BMI and age when cohorts are normalised separately versus together. Within-array methods result in identical EpiScores whether a cohort is normalised on its own or together with the second dataset, whereas a range of differences is observed for between-array methods.

Conclusions: Normalisation methods returning similar EpiScores, whether cohorts are analysed separately or together, will minimise technical variation when projecting new data onto a reference panel. These methods are important for cases where raw data is unavailable and joint normalisation of cohorts is computationally expensive.
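The contrast between within-array and between-array normalisation can be illustrated with a minimal sketch on simulated data. Everything here is an assumption made for illustration: the per-sample standardisation and the quantile normalisation are toy stand-ins rather than any of the 16 methods evaluated in the paper, the score is assumed to be a weighted sum of probe values, and the weights, cohort sizes, and batch shift are hypothetical.

```python
import random
import statistics

def within_array(samples):
    """Toy within-array method: rescale each sample using only its own probes."""
    out = []
    for s in samples:
        mu = statistics.fmean(s)
        sd = statistics.pstdev(s)
        out.append([(v - mu) / sd for v in s])
    return out

def between_array(samples):
    """Toy between-array method: quantile-normalise every sample onto the
    mean sorted profile of all samples passed in together."""
    n_probes = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    reference = [statistics.fmean([s[i] for s in sorted_samples])
                 for i in range(n_probes)]
    out = []
    for s in samples:
        order = sorted(range(n_probes), key=lambda i: s[i])
        normed = [0.0] * n_probes
        for rank, probe in enumerate(order):
            normed[probe] = reference[rank]
        out.append(normed)
    return out

def episcore(sample, weights):
    """Assumed score form: weighted sum of normalised probe values."""
    return sum(w * v for w, v in zip(weights, sample))

def max_shift(normalise, cohort_a, cohort_b, weights):
    """Largest absolute change in cohort A's scores when A is normalised
    alone versus jointly with cohort B."""
    alone = normalise(cohort_a)
    joint = normalise(cohort_a + cohort_b)[:len(cohort_a)]
    return max(abs(episcore(x, weights) - episcore(y, weights))
               for x, y in zip(alone, joint))

rng = random.Random(0)
n_probes = 50
weights = [rng.uniform(-1, 1) for _ in range(n_probes)]  # hypothetical weights
cohort_a = [[rng.betavariate(2, 5) for _ in range(n_probes)] for _ in range(20)]
# Cohort B carries a simulated batch shift of 0.1 (capped at 1.0).
cohort_b = [[min(1.0, rng.betavariate(2, 5) + 0.1) for _ in range(n_probes)]
            for _ in range(20)]

within_shift = max_shift(within_array, cohort_a, cohort_b, weights)
between_shift = max_shift(between_array, cohort_a, cohort_b, weights)
print("within-array max score shift:", within_shift)
print("between-array max score shift:", between_shift)
```

In this sketch the within-array function never looks beyond a single sample, so adding a second cohort cannot change its output, while the between-array reference profile is recomputed from whichever samples are supplied and therefore moves when the second cohort is included.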

Merzbacher, C., Ryan, B., Goldsborough, T., Hillary, R. F., Campbell, A., Murphy, L., … Marioni, R. E. (2023). Integration of datasets for individual prediction of DNA methylation-based biomarkers. Genome Biology, 24(1). https://doi.org/10.1186/s13059-023-03114-5
