
Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes

by Jo Vandesompele, Katleen De Preter, Filip Pattyn, Bruce Poppe, Nadine Van Roy, Anne De Paepe, Frank Speleman
Genome Biology (2002)

Abstract

Using real-time reverse transcription PCR, ten housekeeping genes from different abundance and functional classes were evaluated in various human tissues. The conventional use of a single gene for normalization leads to relatively large errors in a significant proportion of the samples tested.


Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes

Jo Vandesompele, Katleen De Preter, Filip Pattyn, Bruce Poppe, Nadine Van Roy, Anne De Paepe and Frank Speleman

Address: Center for Medical Genetics, Ghent University Hospital 1K5, De Pintelaan 185, B-9000 Ghent, Belgium.

Correspondence: Frank Speleman. E-mail: franki.speleman@rug.ac.be

Abstract

Background: Gene-expression analysis is increasingly important in biological research, with real-time reverse transcription PCR (RT-PCR) becoming the method of choice for high-throughput and accurate expression profiling of selected genes. Given the increased sensitivity, reproducibility and large dynamic range of this methodology, the requirements for a proper internal control gene for normalization have become increasingly stringent. Although housekeeping gene expression has been reported to vary considerably, no systematic survey has properly determined the errors related to the common practice of using only one control gene, nor presented an adequate way of working around this problem.

Results: We outline a robust and innovative strategy to identify the most stably expressed control genes in a given set of tissues, and to determine the minimum number of genes required to calculate a reliable normalization factor. We have evaluated ten housekeeping genes from different abundance and functional classes in various human tissues, and demonstrated that the conventional use of a single gene for normalization leads to relatively large errors in a significant proportion of samples tested. The geometric mean of multiple carefully selected housekeeping genes was validated as an accurate normalization factor by analyzing publicly available microarray data.
Conclusions: The normalization strategy presented here is a prerequisite for accurate RT-PCR expression profiling, which, among other things, opens up the possibility of studying the biological relevance of small expression differences.

Published: 18 June 2002

Genome Biology 2002, 3(7):research0034.1-0034.11

The electronic version of this article is the complete one and can be found online at http://genomebiology.com/2002/3/7/research/0034

© 2002 Vandesompele et al., licensee BioMed Central Ltd (Print ISSN 1465-6906; Online ISSN 1465-6914)

Received: 20 December 2001. Revised: 10 April 2002. Accepted: 7 May 2002.

Background

Gene-expression analysis is increasingly important in many fields of biological research. Understanding patterns of expressed genes is expected to provide insight into complex regulatory networks and will most probably lead to the identification of genes relevant to new biological processes, or implicated in disease. Two recently developed methods to measure transcript abundance have gained much popularity and are frequently applied. Microarrays allow the parallel analysis of thousands of genes in two differentially labeled RNA populations [1], while real-time RT-PCR provides the simultaneous measurement of gene expression in many different samples for a limited number of genes, and is especially suitable when only a small number of cells are available [2-4]. Both techniques have the advantage of speed, throughput and a high degree of potential automation compared to conventional quantification methods, such as northern-blot analysis, ribonuclease protection assay, or
competitive RT-PCR. Nevertheless, these new approaches require the same kind of normalization as the traditional methods of mRNA quantification.

Several variables need to be controlled for in gene-expression analysis, such as the amount of starting material, enzymatic efficiencies, and differences between tissues or cells in overall transcriptional activity. Various strategies have been applied to normalize these variations. Under controlled conditions of reproducible extraction of good-quality RNA, the gene transcript number is ideally standardized to the number of cells, but accurate enumeration of cells is often precluded, for example when starting with solid tissue.

Another frequently applied normalization scalar is the RNA mass quantity, especially in northern blot analysis. There are several arguments against the use of mass quantity. The quality of RNA and related efficiency of the enzymatic reactions are not taken into account. Moreover, in some instances it is impossible to quantify this parameter, for example, when only minimal amounts of RNA are available from microdissected tissues. Probably the strongest argument against the use of total RNA mass for normalization is the fact that it consists predominantly of rRNA molecules, and is not always representative of the mRNA fraction. This was recently evidenced by a significant imbalance between rRNA and mRNA content in approximately 7.5% of mammary adenocarcinomas [5]. Also, it has been reported that rRNA transcription is affected by biological factors and drugs [6-8]. Further drawbacks to the use of 18S or 28S rRNA molecules as standards are their absence in purified mRNA samples, and their high abundance compared to target mRNA transcripts. The latter makes it difficult to accurately subtract the baseline value in real-time RT-PCR data analysis.

To date, internal control genes are most frequently used to normalize the mRNA fraction.
This internal control - often referred to as a housekeeping gene - should not vary in the tissues or cells under investigation, or in response to experimental treatment. However, many studies make use of these constitutively expressed control genes without proper validation of their presumed stability of expression. But the literature shows that housekeeping gene expression - although occasionally constant in a given cell type or experimental condition - can vary considerably (reviewed in [9-12]). With the increased sensitivity, reproducibility and large dynamic range of real-time RT-PCR methods, the requirements for a proper internal control gene have become increasingly stringent. In this study, we carried out an extensive evaluation of 10 commonly used housekeeping genes in 13 different human tissues, and outlined a procedure for calculating a normalization factor based on multiple control genes for more accurate and reliable normalization of gene-expression data. Furthermore, this normalization factor was validated in a comparative study with frequently applied microarray scaling factors using publicly available microarray data.

Results

Expression profiling of housekeeping genes

Primers were designed for ten commonly used housekeeping genes (ACTB, B2M, GAPD, HMBS, HPRT1, RPL13A, SDHA, TBP, UBC and YWHAZ) (see Table 1 for full gene name, accession number, function, chromosomal localization, alias, existence of processed pseudogenes, and indication that primers span an intron; see Table 2 for primer sequences). Special attention was paid to selecting genes that belong to different functional classes, which significantly reduces the chance that genes might be co-regulated.
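As a minimal sketch of the normalization factor mentioned in the Background (the geometric mean of several control genes, per the title and abstract), the Python below computes such a factor for one sample and divides a target gene's level by it. The function names and the numeric values are hypothetical, chosen only for illustration; the sketch assumes expression levels are already positive relative quantities on a linear scale, and it is not taken from the authors' own implementation.

```python
import math


def normalization_factor(control_levels):
    """Geometric mean of the control-gene expression levels of one sample.

    control_levels: positive relative quantities on a linear scale
    (an assumption of this sketch, not a detail stated in the text).
    """
    if not control_levels or any(x <= 0 for x in control_levels):
        raise ValueError("expression levels must be positive and non-empty")
    # Averaging the logs and exponentiating gives the geometric mean
    # without multiplying many values together.
    mean_log = sum(math.log(x) for x in control_levels) / len(control_levels)
    return math.exp(mean_log)


def normalize(target_level, control_levels):
    """Target-gene level divided by the sample's normalization factor."""
    return target_level / normalization_factor(control_levels)


# Hypothetical levels of three control genes in a single sample.
controls = [1.0, 4.0, 16.0]
factor = normalization_factor(controls)   # geometric mean of the controls
normalized = normalize(8.0, controls)     # hypothetical target level of 8.0
```

Compared with an arithmetic mean, the geometric mean is less dominated by a single highly abundant control gene, which is one plausible reason to prefer it when the controls come from different abundance classes.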
The expression level of these 10 internal control genes was determined in 34 neuroblastoma cell lines (independently prepared in different labs from different patients), 20 short-term cultured normal fibroblast samples from different individuals, 13 normal leukocyte samples, 9 normal bone-marrow samples, and 9 additional normal human tissues from pooled organs (heart, brain, fetal brain, lung, trachea, kidney, mammary gland, small intestine and uterus). The raw expression values are available as a tab-delimited file (see Additional data files).

Single control normalization error

To determine the possible errors related to the common practice of using only one housekeeping gene for normalization, we calculated the ratio of the ratios of two control genes in two different samples (from the same tissue panel) and termed it the single control normalization error, E (see Materials and methods). For two ideal internal control genes (constant ratio between the genes in all samples), E equals 1. In practice, observed E values are larger than 1 and constitute the erroneous E-fold expression difference between two samples, depending on the particular housekeeping gene used for normalization. E values were calculated for all 45 two-by-two combinations of control genes and 865 two-by-two sample combinations within the available tissue panels (neuroblastoma, fibroblast, leukocyte, bone marrow and a series of normal tissues from Clontech; that is, a total of 38,925 data points) (Figure 1). In addition, the systematic error distribution was calculated by analysis of repeated runs of the same control gene. The average 75th and 90th percentile E values are 3.0 (range 2.1-3.9), and 6.4 (range 3.0-10.9), respectively.

Gene-stability measure and ranking of selected housekeeping genes

It is generally accepted that gene-expression levels should be normalized by a carefully selected stable internal control gene.
However, to validate the presumed stable expression of a given control gene, prior knowledge of a reliable measure to normalize this gene in order to remove any nonspecific variation is required. To address this circular problem, we developed a gene-stability measure to determine the expression stability of control genes on the basis of non-normalized expression levels. This measure relies on the principle that the expression ratio of two ideal internal control genes is
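The single control normalization error E from the section above (the ratio of the ratios of two control genes in two samples) can be sketched in Python as follows. The exact formula is given in the paper's Materials and methods, which is not reproduced here, so taking the larger of the ratio and its inverse (so that E is never below 1) is an assumption of this sketch. The function name, sample labels and expression values are hypothetical; only the ten gene symbols and the counts 45, 865 and 38,925 come from the text.

```python
from itertools import combinations

# The ten control genes named in the text.
GENES = ["ACTB", "B2M", "GAPD", "HMBS", "HPRT1",
         "RPL13A", "SDHA", "TBP", "UBC", "YWHAZ"]


def single_control_error(gene_j, gene_k, sample_p, sample_q):
    """E for two control genes measured in two samples.

    gene_j, gene_k: dicts mapping a sample label to a positive expression level.
    Assumption: E is folded to be >= 1 by taking max(r, 1/r).
    """
    ratio_p = gene_j[sample_p] / gene_k[sample_p]  # gene ratio in sample p
    ratio_q = gene_j[sample_q] / gene_k[sample_q]  # gene ratio in sample q
    r = ratio_p / ratio_q                          # ratio of the two ratios
    return max(r, 1.0 / r)


# All two-by-two combinations of the ten control genes.
gene_pairs = list(combinations(GENES, 2))

# Hypothetical expression levels in two samples, s1 and s2.
ideal_j = {"s1": 2.0, "s2": 4.0}
ideal_k = {"s1": 1.0, "s2": 2.0}    # constant ratio to ideal_j, so E is 1
drifting_k = {"s1": 1.0, "s2": 1.0}  # ratio changes twofold, so E is 2

e_ideal = single_control_error(ideal_j, ideal_k, "s1", "s2")
e_drift = single_control_error(ideal_j, drifting_k, "s1", "s2")
```

With ten genes there are 45 gene pairs, and 45 pairs times the 865 sample pairs reported above gives the 38,925 data points quoted in the text.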
