Epigenome-wide association studies for common human diseases.

by Vardhman K Rakyan, Thomas A Down, David J Balding, Stephan Beck
Nature Reviews Genetics (2011)

Abstract

Despite the success of genome-wide association studies (GWASs) in identifying loci associated with common diseases, a substantial proportion of the causality remains unexplained. Recent advances in genomic technologies have placed us in a position to initiate large-scale studies of human disease-associated epigenetic variation, specifically variation in DNA methylation. Such epigenome-wide association studies (EWASs) present novel opportunities but also create new challenges that are not encountered in GWASs. We discuss EWAS design, cohort and sample selections, statistical significance and power, confounding factors and follow-up studies. We also discuss how integration of EWASs with GWASs can help to dissect complex GWAS haplotypes for functional analysis.

Available from Nature Reviews Genetics

Vardhman K. Rakyan*, Thomas A. Down‡, David J. Balding§ and Stephan Beck||

*Blizard Institute of Cell and Molecular Science, Barts and The London School of Medicine and Dentistry, Queen Mary, University of London, 4 Newark Street, London E1 2AT, UK.
‡Wellcome Trust Cancer Research UK Gurdon Institute and Department of Genetics, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK.
§Genetics Institute, University College London, Darwin Building, Gower Street, London WC1E 6BT, UK.
||UCL Cancer Institute, University College London, 72 Huntley Street, London WC1E 6BT, UK.
Correspondence to V.K.R., D.J.B. and S.B.
e-mails: v.rakyan@qmul.ac.uk; d.balding@ucl.ac.uk; s.beck@ucl.ac.uk
doi:10.1038/nrg3000
Published online 12 July 2011

REVIEWS | STUDY DESIGNS
Nature Reviews Genetics, Volume 12, August 2011, page 529
© 2011 Macmillan Publishers Limited. All rights reserved

Elucidating the genetic and non-genetic determinants of human complex diseases represents one of the principal challenges of biomedical research. In recent years, genome-wide association studies (GWASs) have uncovered 800 SNP associations for more than 150 diseases and other traits [1]. Although the complete genetic basis is not yet known for any human complex disease, resequencing of exomes, and ultimately whole genomes, holds promise for identifying most of the causal genetic variations. However, there is now increasing interest in exploring how non-genetic variation, including epigenetic factors, could influence complex disease aetiology [2–4].

The epigenome of a cell is highly dynamic, being governed by a complex interplay of genetic and environmental factors [5]. Normal cellular function relies on the maintenance of epigenomic homeostasis, which is further highlighted by numerous reported associations between epigenomic perturbations and human diseases, notably cancer [4]. However, most studies of such associations to date have been performed either with inadequate genome coverage (for example, tens to hundreds of loci) but adequate sample size, or with coverage that is closer to being genome-wide (thousands of loci) but inadequate sample size. Consequently, for any human complex disease, we remain unaware of the proportion of phenotypic variation that is attributable to inter-individual epigenomic variation. This problem can only be elucidated by large-scale, systematic epigenomic equivalents of GWASs, namely epigenome-wide association studies (EWASs), as first proposed in 2008 (REF. 6). At least for DNA methylation (DNAm), technology is now available that is directly comparable in resolution and throughput to the highly successful GWAS chips that allow genotyping of around 500,000 (500K) SNPs.

But how does one conduct an EWAS? In addition to considerations that are common to both GWASs and EWASs (for example, adequate technology and sample size), the design of EWASs has specific considerations with respect to sample selection. DNAm patterns are specific to tissues and developmental stages, and they also change over time. Furthermore, EWAS associations can be causal as well as consequential for the phenotype in question, a difference from GWASs that presents considerable challenges. Here, we discuss these considerations in the context of designing and analysing an effective EWAS, keeping in mind that EWASs are likely to evolve, much like GWASs did, as information and experience accumulate.

Glossary

Genome-wide association studies (GWASs). These are genome-wide studies that are designed to identify genetic associations with an observable trait, disease or condition, such as diabetes.

Exome. The part of a genome that encodes exons for translation into proteins.

Box 1 | Definition of features known to vary in DNA methylation

This rapidly increasing list of features is not meant to be complete but intends to show the key loci and contexts in which DNA methylation (DNAm) is known to vary.

Methylation variable position (MVP). A CpG site that shows differential methylation, for example between different disease states, as illustrated in the figure. Given recent findings on non-CpG methylation, potentially all Cs could be MVPs.

Differentially methylated region (DMR). A region of the genome at which multiple adjacent CpG sites show differential methylation. DMRs can occur in many different contexts, such as:

  • iDMR: imprinting-specific differentially methylated region
  • tDMR: tissue-specific differentially methylated region
  • rDMR: reprogramming-specific differentially methylated region
  • cDMR: cancer-specific differentially methylated region
  • aDMR: ageing-specific differentially methylated region

Variably methylated region (VMR). These regions are defined by increased variability rather than gain or loss of DNAm.

Allele-specific methylation (ASM). These are positions or regions that vary in DNAm depending on the parent-of-origin, the presence of a polymorphism or as a result of a stochastic event.

Haplotype-specific methylation (HSM). This is a differentially methylated region that is defined by a set of co-inherited SNPs (a haplotype).

CpG islands (CGIs). These are regions enriched for CpG sites. Most CGIs are unmethylated in all cell types.

CGI shores. These are regions immediately adjacent to CGIs and display higher variation in DNAm than CGIs despite their lower density of CpG sites.

[Figure: schematic of unmethylated and methylated CpGs on alleles a and b in controls and cases, with haplotypes shown as SNP alleles; panels are labelled MVP, DMR, VMR, ASM and HSM.]

The figure shows different types of DNAm variation that can be identified with epigenome-wide association studies. The notation n is used to indicate the variable size of the regions shown. For the purpose of this simplified illustration, the cases and controls are assumed to have methylated or unmethylated CpG states only. Real samples will contain populations of different cells and hence display much more heterogeneous methylation levels across the full dynamic range between 0% and 100%.

Epigenetic variation and complex disease

Types of epigenetic information. Epigenetic information in mammals can be transmitted in multiple forms [5], including mitotically stable DNAm, post-translational modifications of histone proteins and non-coding RNAs (ncRNAs). For DNAm, the predominant form is methylation of cytosines in the context of cytosine-guanine dinucleotides (CpGs). However, recent results suggest that CpH methylation (where H = C/A/T) may be more common than previously appreciated [7,8]. Catalysed by the ten-eleven translocation (TET) methylcytosine dioxygenases, 5-hydroxymethylation [9,10] of cytosines (hmC) is yet another form of DNAm. Although details are still unclear, increasing evidence suggests a role of hmC in gene regulation and differentiation [11]. Histone modifications include, to name but a few, mono-, di- or trimethylation, acetylation and citrullination of one or more amino acids in the amino-terminal tails of core histones [5]. More recently, it has been discovered that ncRNAs can self-propagate and be transmitted independently of the underlying DNA; in other words, they can 'epigenetically' transmit regulatory information [12,13]. Such ncRNAs include short microRNAs (miRNAs), PIWI-interacting RNAs (piRNAs) and large intergenic non-coding RNAs (lincRNAs), among others [12].

Epigenetic variation in health and disease. The full range of epigenetic marks is currently unknown but is potentially enormous, considering that the diploid human epigenome contains 10^8 Cs (of which 10^7 are CpGs) and 10^8 histone tails that can all potentially vary. The most studied epigenetic mark is DNAm, and BOX 1 discusses the most common features and contexts in which DNAm varies. DNAm variation at a single CpG site is known as a methylation variable position (MVP), which can be considered as the epigenetic equivalent of a SNP [14]. Very rarely, CpGs on only one of the two strands of DNA per allele are methylated. This is known as hemimethylation, and it probably reflects post-replication lag in DNAm maintenance in proliferating cells. If DNAm is altered at multiple adjacent CpG sites, this is referred to as a differentially methylated region (DMR). DMRs vary considerably in length:
