NATURE BIOTECHNOLOGY VOLUME 24 NUMBER 9 SEPTEMBER 2006 1115

Evaluation of DNA microarray results with quantitative gene expression platforms

Roger D Canales1,10, Yuling Luo2,10, James C Willey3,10, Bradley Austermiller3, Catalin C Barbacioru1, Cecilie Boysen4, Kathryn Hunkapiller1, Roderick V Jensen5, Charles R Knight6, Kathleen Y Lee1, Yunqing Ma2, Botoul Maqsodi2, Adam Papallo5, Elizabeth Herness Peters6, Karen Poulter1, Patricia L Ruppel7, Raymond R Samaha1, Leming Shi8, Wen Yang2, Lu Zhang1 & Federico M Goodsaid9

We have evaluated the performance characteristics of three quantitative gene expression technologies and correlated their expression measurements to those of five commercial microarray platforms, based on the MicroArray Quality Control (MAQC) data set. The limit of detection, assay range, precision, accuracy and fold-change correlations were assessed for 997 TaqMan Gene Expression Assays, 205 Standardized RT (Sta)RT-PCR assays and 244 QuantiGene assays. We observed high correlation between quantitative gene expression values and microarray platform results and found few discordant measurements among all platforms. The main cause of variability was differences in probe sequence and thus target location. A second source of variability was the limited and variable sensitivity of the different microarray platforms for detecting weakly expressed genes, which affected interplatform and intersite reproducibility of differentially expressed genes. From this analysis, we conclude that the MAQC microarray data set has been validated by alternative quantitative gene expression platforms, thus supporting the use of microarray platforms for the quantitative characterization of gene expression.

TaqMan is a registered trademark of Roche Molecular Systems, Inc.

To evaluate performance characteristics of gene expression measurement technologies and the data they generate, one must identify alternative quantitative platforms that can be used as references.
The MAQC consortium used the TaqMan assays, Standardized (Sta)RT-PCR and QuantiGene platforms for this purpose because these platforms had been shown to have high assay specificity and detection sensitivity, broad linear dynamic range and high signal-to-analyte response [1–4]. The platforms were used to evaluate some of these performance characteristics in each commercial whole genome microarray platform investigated in the MAQC study. In addition, we report the fold-change correlation of each alternative quantitative platform relative to these microarray platforms. We observed high correlations between the quantitative platform measurements and the data derived from the microarrays and were also able to identify the sources of variability among microarray platforms relative to the quantitative platforms.

Here we define validation as a measure of the concordance and discordance of the microarray data with the quantitative reference platforms selected: we used the results of the quantitative platforms as a reference against which to evaluate the microarray platforms. We have thus not attempted to establish a "gold standard" for expression measurements, but rather a solid reference point to allow data validation.

Quantitative, real-time PCR has been developed over the last decade to specifically measure template molecule numbers [4,5]. The development of fluorogenic probes [6] enabled accurate quantification of PCR products through measurement of a fluorescence signal during the exponential amplification phase. TaqMan Gene Expression Assays are based on the use of the 5′ nuclease activity of Taq polymerase to hydrolyze a target-specific, dual-labeled, fluorogenic hybridization probe during the extension phase [7]. The number of template transcript molecules in a sample is determined by recording the amplification cycle in the exponential phase (cycle threshold or CT), at which time the fluorescence signal can be detected above background fluorescence.
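The link between CT and starting template number can be illustrated with a minimal sketch. It assumes idealized amplification in which the product doubles every cycle; that assumption, the function name `fold_difference_from_ct` and the example CT values are hypothetical and are not taken from this study.

```python
def fold_difference_from_ct(ct_sample: float, ct_reference: float,
                            amplification_factor: float = 2.0) -> float:
    """Estimate how many times more template the sample started with
    than the reference, given the cycle at which each crossed threshold.

    Assumption: the product grows by `amplification_factor` per cycle
    (2.0 means perfect doubling), which real assays only approximate.
    """
    if amplification_factor <= 1.0:
        raise ValueError("amplification_factor must be greater than 1")
    # Crossing the threshold earlier (lower CT) means more starting template.
    return amplification_factor ** (ct_reference - ct_sample)


# Hypothetical values: the sample crosses threshold 3 cycles before the reference.
print(fold_difference_from_ct(ct_sample=22.0, ct_reference=25.0))  # 8.0
```

Under this idealization each cycle of difference corresponds to a twofold difference in starting template, so a sample that reaches threshold three cycles earlier began with roughly eight times more template.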
Thus, the starting number of template transcript molecules is inversely related to CT: the more template transcript molecules at the beginning, the lower the CT [7,8]. TaqMan assays have been used in recent studies to validate microarray data [9–11].

StaRT-PCR [4,12] is a competitive PCR-based platform that enables endpoint quantification of PCR products. After RNA is converted to cDNA, the cDNA is added to a standardized mixture of internal standard (SMIS) competitive templates, aliquoted into microplate wells containing gene-specific PCR primers and amplified for 35 cycles. The individual endpoint StaRT-PCR products are then separated by size and quantified by high-throughput microfluidic electrophoresis. StaRT-PCR has also been used in studies to validate microarray data [1] and has been used to generate potential biomarkers for disease stratification [13,14].

1 Applied Biosystems, 850 Lincoln Centre Dr., Foster City, California 94404, USA.
2 Panomics, Inc., 6519 Dumbarton Circle, Fremont, California 94555, USA.
3 University of Toledo, Toledo, Ohio 43614, USA.
4 ViaLogy Corp., 2400 Lincoln Avenue, Altadena, California 91001, USA.
5 University of Massachusetts-Boston, 100 Morrissey Blvd., Boston, Massachusetts 02125, USA.
6 Gene Express, Inc., 975 Research Drive, Toledo, Ohio 43614, USA.
7 Innovative Analytics, 7107 Elm Valley Dr., Kalamazoo, Michigan 49009, USA.
8 National Center for Toxicological Research, US Food and Drug Administration, 3900 NCTR Rd., Jefferson, Arkansas 72079, USA.
9 Center for Drug Evaluation and Research, US Food and Drug Administration, 10903 New Hampshire Ave., Silver Spring, Maryland 20993, USA.
10 These authors contributed equally to this work.
Correspondence should be addressed to F.M.G. (Federico.Goodsaid@fda.hhs.gov).

Published online 8 September 2006; doi:10.1038/nbt1236

ANALYSIS
© 2006 Nature Publishing Group http://www.nature.com/naturebiotechnology
The QuantiGene Reagent System [15] detects DNA and RNA directly without a reverse transcription step. It is a sandwich nucleic acid hybridization platform in which targets are captured through cooperative hybridization of multiple probes [16]. This complex is detected through signal amplification by a branched DNA amplifier and chemiluminescence signal generation. The QuantiGene assay has been used in US Food and Drug Administration-approved clinical diagnostic products for quantitative viral load determination of HIV, hepatitis C virus and hepatitis B virus with detection sensitivity of 50 transcript molecules [17–19]. Because the QuantiGene assay can measure gene expression either by measuring RNA directly without a reverse transcription step, or by measuring cDNA without PCR amplification, it provides an independent method of measurement relative to the quantitative reverse transcription (RT)-PCR and microarray platforms.

Application of these quantitative platforms in the MAQC project increased the confidence in concordance observed between the microarray platforms. In addition, the results obtained from using these platforms allowed us to explore the sources of variability among microarray platforms. With this comprehensive evaluation, we demonstrate the value of alternative quantitative platforms as tools for the independent validation of microarray data and the resolution of discordant results.

RESULTS

Assay performance of three alternative quantitative platforms

The MAQC consortium selected a list of 1,297 genes to evaluate and compare the performance of microarray and alternative quantitative platforms and to identify and analyze discordant results. TaqMan assays, StaRT-PCR and QuantiGene assays were performed on 997, 205 and 244 of the 1,297 genes, respectively.
Gene lists used for analysis of selected performance metrics for quantitative platforms, and for analysis of concordance between the quantitative platforms and microarrays, are shown in Supplementary Table 1 online.

Four RNA samples (A, B, C and D), provided by the MAQC consortium, were analyzed [20]. TaqMan assays were done in quadruplicate, and StaRT-PCR assays in triplicate, on cDNA generated from 10 ng total RNA (Supplementary Methods online). Both the TaqMan assays and StaRT-PCR were based on cDNA from a single reverse transcription reaction. QuantiGene assays were performed in triplicate directly from 500 ng of total RNA (Table 1). The performance metrics presented are not directly comparable because each platform assayed a different gene set and had a different assay range and signal-to-analyte response.

Detection sensitivity

TaqMan assay quantification is directly related to CT. A gene is scored as not detectable when its average CT lies beyond a cutoff of 35 cycles. By this definition, 857 genes (86%) were detectable in both A and B. The StaRT-PCR detection limit is defined as ten transcript molecules. By this definition, 193 genes (94%) were detectable in both A and B. For QuantiGene, the detection limit is defined as a signal three standard deviations (s.d.) above the background. By this standard, 223 genes (91.4%) were detectable in both A and B.

Assay range

The assay range represents the difference in signals measured on a log10 scale between genes with the highest and the lowest expression. The assay range for TaqMan assays was 8.1, with CT values ranging from 8 (10^8 transcript molecules) for 18S rRNA to 35 (~5 transcript molecules) for low expressors. For StaRT-PCR, the assay range was 6.8, with normalized transcripts ranging from 6.4 × 10^7 transcript molecules for 18S rRNA to 10 transcript molecules for low expressors.
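The assay range figures quoted for StaRT-PCR and TaqMan can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the source does not state how the TaqMan value of 8.1 was derived. Treating each cycle between CT 8 and CT 35 as a perfect doubling is an assumption that happens to reproduce it.

```python
import math


def assay_range_from_signals(highest: float, lowest: float) -> float:
    """Assay range as log10 of (highest signal / lowest signal)."""
    return math.log10(highest / lowest)


def assay_range_from_ct(ct_lowest: float, ct_highest: float,
                        amplification_factor: float = 2.0) -> float:
    """Assay range in log10 units spanned by a range of CT values.

    Assumption: each cycle multiplies the product by `amplification_factor`.
    """
    return (ct_highest - ct_lowest) * math.log10(amplification_factor)


# StaRT-PCR: 6.4 x 10^7 down to 10 transcript molecules.
start_pcr_range = assay_range_from_signals(6.4e7, 10)
# TaqMan: CT 8 (18S rRNA) up to CT 35 (low expressors).
taqman_range = assay_range_from_ct(8, 35)

print(round(start_pcr_range, 1), round(taqman_range, 1))  # 6.8 8.1
```

The StaRT-PCR figure follows directly from the ratio of transcript molecule counts, whereas the TaqMan figure is recovered here from the span of 27 cycles rather than from the approximate molecule counts quoted alongside the CT values.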
For QuantiGene, the assay range was 4.1, from the highest signal of 599 relative luminescence units (RLU) for LDHA to the lowest detectable signal of 0.045 RLU for SPARCL1.

Precision

The precision of the three alternative quantitative platforms was measured by coefficient of variation (CV) (Fig. 1 and Table 1) or s.d. (Supplementary Fig. 1 online). There were interplatform differences in the number of transcript molecules (RNA or cDNA) loaded into each assay. Because of differences in the amount of sample loaded (Table 1), a majority of the genes measured with QuantiGene contained ≥6,000 transcript molecules in the assay, whereas a majority of those measured by TaqMan assays and StaRT-PCR had fewer. The latter two platforms were used to assess the previously reported stochastic process involved in the relationship between transcript molecules loaded and CV [21]. A clear trend of increased CV with decreasing abundance of transcripts was observed for TaqMan assays and StaRT-PCR when <6,000 transcript

Table 1 Summary of platform performance metrics

| Metric | TAQ | GEX | QGN |
|---|---|---|---|
| Gene list: number of genes tested | 997 | 205 | 244 |
| Sample processing: sample input | cDNA from 10 ng total RNA, one RT reaction | cDNA from 10 ng total RNA, one RT reaction | 500 ng total RNA |
| Sample processing: assay replicates | Four replicates of cDNA | Three replicates of cDNA | Three replicates of RNA directly |
| Sample processing: data presentation | Normalized against POLR2A | Normalized against beta-actin | Original data |
| Detection sensitivity (a): both A & B above LOD | 857 (86%) | 193 (94%) | 223 (91%) |
| Detection sensitivity (a): both A & B below LOD | 38 (3.8%) | 4 (2.0%) | 5 (2.0%) |
| Dynamic range (b) (log10) | 8.1 | 6.8 | 4.1 |
| Precision (c) (median): all data | 3.46 | 6.26 | 2.16 |
| Precision (c) (median): ≥6,000 | 2.42 | 3.82 | 2.12 |
| Accuracy (d) (median): linearity (e) (R2) | 0.950 | 0.96 (h) | 0.994 |
| Accuracy (d) (median): RA (% median) (f) | 3.6 | 0.4 (h) | 1.0 |
| Accuracy (d) (median): RA (% variance) (g) | 9.4 | 21.1 (h) | 5.0 |

(a) Detection sensitivity: the number (percent) of detectable or undetectable genes in both sample A and B, based on each platform's detection limit.
(b) Assay range: based on the ratio of (highest detectable signal/lowest detectable signal) of all the genes and samples measured in each platform.
(c) Precision: based on the median value of CV measured either a) in all genes and all samples in each platform or b) in samples with 6,000 transcript molecules or above.
(d) Based on the formula C = 0.75A + 0.25B and D = 0.25A + 0.75B for TaqMan assays and QuantiGene, and C = 0.88A + 0.12B and D = 0.45A + 0.55B for StaRT-PCR.
(e) Linearity: based on the median R2 of the linear fit of assay signal from samples A, B, C and D for all the detectable genes with greater than twofold difference between A and B. 829, 125 and 223 genes are analyzed for TaqMan, StaRT-PCR and QuantiGene, respectively.
(f) RA score (% median): the RA (relative accuracy) score for samples C and D for a gene is defined as (C − C′)/C′ and (D − D′)/D′, where C′ and D′ are the expected values; it represents the percent difference of the experimental value from the expected value. The median value of the % RA score for both samples C and D combined is presented here. Only genes detectable in both A and B are analyzed for each platform.
(g) RA score (% variance): the median value of the absolute RA scores for both samples C and D combined is presented here.
(h) Based on a recalibrated data set (Supplementary Methods).
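The relative accuracy (RA) score defined in the table footnotes can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names and the example signal values are hypothetical, and the mixing fractions are passed in as parameters, with the StaRT-PCR values for sample C (0.88 and 0.12) used in the example.

```python
from statistics import median


def expected_mixture(a: float, b: float, frac_a: float, frac_b: float) -> float:
    """Expected signal of a mixed sample from the signals of samples A and B."""
    return frac_a * a + frac_b * b


def ra_score(measured: float, expected: float) -> float:
    """Relative accuracy: fractional deviation of measured from expected."""
    return (measured - expected) / expected


def summarize_ra(scores: list[float]) -> tuple[float, float]:
    """Return (median of % RA, median of absolute % RA) over a set of genes."""
    percent = [100.0 * s for s in scores]
    return median(percent), median([abs(p) for p in percent])


# Hypothetical signals for one gene in samples A, B and the mixture C.
a_signal, b_signal, c_measured = 100.0, 20.0, 95.0
c_expected = expected_mixture(a_signal, b_signal, frac_a=0.88, frac_b=0.12)
print(c_expected, ra_score(c_measured, c_expected))

# Hypothetical RA scores for three genes, summarized as in the table.
print(summarize_ra([0.10, -0.20, 0.05]))
```

Because positive and negative deviations cancel in the signed median, the median of the absolute scores gives a separate view of how far individual genes typically fall from the expected mixture value.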