Intensity-based analysis of dual-color gene expression data as an alternative to ratio-based analysis to enhance reproducibility

Abstract

Background: Ratio-based analysis is the current standard for the analysis of dual-color microarray data. This method provides a powerful means to account for potential technical variations such as differences in background signal, spot size and spot concentration. However, current high density dual-color array platforms are of very high quality, and inter-array variance has become much less pronounced. We therefore asked whether it is feasible to use an intensity-based analysis rather than a ratio-based analysis of dual-color microarray datasets. Furthermore, we compared the performance of ratio- and intensity-based analyses in terms of reproducibility and sensitivity for differential gene expression.

Results: By analyzing three distinct and technically replicated datasets with either ratio- or intensity-based models, we determined that, when applied to the same dataset, intensity-based analysis of dual-color gene expression experiments 1) yields more reproducible results, and 2) is more sensitive in the detection of differentially expressed genes. These effects were most pronounced in experiments with large biological variation and complex hybridization designs. Furthermore, a power analysis revealed that for direct two-group comparisons above a certain sample size, ratio-based models have higher power, although the difference with intensity-based models is very small.

Conclusions: Intensity-based analysis of dual-color datasets yields more reproducible results and greater sensitivity in the detection of differential gene expression than ratio-based analysis of the same dataset. Complex dual-color setups such as interwoven loop designs benefit most from ignoring the array factor. The applicability of our approach to array platforms other than dual-color needs to be further investigated. © 2010 Bossers et al; licensee BioMed Central Ltd.
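
To make the distinction between the two kinds of input concrete, the following is a minimal sketch, not the authors' implementation. It assumes background-corrected red and green channel intensities stored as arrays of shape (genes, arrays); the function names are hypothetical. A ratio-based analysis works on one log2 ratio (M-value) per spot, whereas an intensity-based analysis treats the log2 intensity of each channel as a separate observation. The average log2 intensity (A-value) is the quantity used for filtering in the figure captions (A > 7).

```python
import numpy as np

def ratio_values(red, green):
    """M-values: log2 ratio of red over green, one value per spot."""
    red = np.asarray(red, dtype=float)
    green = np.asarray(green, dtype=float)
    return np.log2(red) - np.log2(green)

def average_intensity(red, green):
    """A-values: mean of the two log2 channel intensities per spot."""
    red = np.asarray(red, dtype=float)
    green = np.asarray(green, dtype=float)
    return 0.5 * (np.log2(red) + np.log2(green))

def single_channel_values(red, green):
    """Log2 intensities with both channels kept as separate columns.

    Input shape (genes, arrays) becomes (genes, 2 * arrays):
    all red-channel columns first, then all green-channel columns.
    """
    red = np.asarray(red, dtype=float)
    green = np.asarray(green, dtype=float)
    return np.concatenate([np.log2(red), np.log2(green)], axis=1)

# Toy data (hypothetical): one gene measured on two arrays.
red = np.array([[8.0, 16.0]])
green = np.array([[2.0, 16.0]])
m = ratio_values(red, green)
a = average_intensity(red, green)
x = single_channel_values(red, green)
```

The ratio input halves the number of observations per gene but cancels spot-specific effects; the single-channel input keeps every measurement, which only pays off when the array effect is small relative to the treatment effect.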

Figures

  • Figure 1 Intensities of the same sample measured on separate arrays are highly correlated. Hierarchical clustering of log2-transformed single channel intensities of the complete cell line experiment. Only genes with an average intensity A > 7 were used. Note that identical cell line-treatment combinations always cluster together, regardless of the co-hybridized sample. Sample naming = [cell line] [treatment] [duplicate set].
  • Figure 2 Comparison of effect sizes for array and treatment factors. Comparison between the relative sizes of the array and treatment effects, derived from the ANOVA model. Panel A: cell line dataset. Panel B: MAQC dataset. Dashed line: smoothed histogram over all genes for treatment effect size (absolute value of M-value), averaged over all treatment comparisons. Solid line: smoothed histogram over all genes for average array effect size (absolute value of M-value). Note that both the treatment and array effects still include an unavoidable noise component, hence one expects a partial overlap in the histograms because of genes that do not show a differential effect between treatments. Still there is a clear proportion of genes for which the mean treatment effect is much larger than the array effect size.
  • Figure 3 Ratio- and intensity-based analysis results in similar sets of differentially expressed genes. For the cell line dataset C1, p-values generated by the ratio and intensity ANOVA models were ranked from low to high, and assigned to bins containing 1000 genes. The fraction of overlap represents the proportion of genes occurring in both sets.
  • Figure 4 Intensity models provide more reproducible results than ratio models. A, B) Comparison between the reproducibility of p-values between technically duplicated experiments, generated by the ratio model (A) and the intensity model (B). Note the higher correlation for the intensity model. p-values are given as -log10(p-value): higher values are more significant. C) Proportion of genes reproduced by either the ratio or intensity model, for sets of equally ranked genes between the replicate datasets C1 and C2.
  • Table 1 The intensity model is favored over the ratio model based on BIC model selection.
  • Figure 5 Single channel clustering of human brain dataset. Hierarchical clustering of log2-transformed single channel intensities of the human brain experiment. Only genes with an average intensity A > 7 were used. Note that, for all 49 individuals, the two replicate measurements cluster together.
  • Figure 6 Comparison between ratio and intensity model-based reproducibility in the brain dataset. A, B) Comparison between the reproducibility of p-values between the split brain datasets B1 and B2, generated by the ratio model (A) and the intensity model (B). Note the absence of correlation between p-values calculated with the ratio model. p-values are given as -log10(p-value): higher values are more significant. C) Proportion of genes reproduced in sets of equally ranked genes between the replicate datasets.
  • Figure 7 Reproducibility of between-group treatment effects based on ratio and intensity models. Reproducibility of ANOVA-derived treatment effects between group 0 and group 6 in replicate brain datasets B1 and B2. Panel A: reproducibility of treatment effects derived from the ratio model. Panel B: reproducibility of treatment effects derived from the intensity model. Note the enhanced reproducibility when using the intensity-based ANOVA model.
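
The rank-bin comparison described in the captions of Figures 3, 4C and 6C can be sketched as follows. This is an illustrative reading only: the captions do not say whether the bins are disjoint or cumulative, so disjoint bins of consecutive ranks are assumed here, and the function name `binned_overlap` is hypothetical. Each gene is identified by its index in the two p-value vectors.

```python
import numpy as np

def binned_overlap(p_a, p_b, bin_size=1000):
    """Fraction of genes shared between equally ranked bins of two p-value lists.

    p_a and p_b hold p-values for the same genes in the same order.
    Genes are ranked from low to high p-value in each list, the ranks are
    cut into disjoint bins of `bin_size` genes (an assumption), and for
    each bin the proportion of genes present in both lists is returned.
    """
    p_a = np.asarray(p_a, dtype=float)
    p_b = np.asarray(p_b, dtype=float)
    if p_a.shape != p_b.shape:
        raise ValueError("both p-value vectors must cover the same genes")
    order_a = np.argsort(p_a, kind="stable")
    order_b = np.argsort(p_b, kind="stable")
    fractions = []
    for start in range(0, len(order_a), bin_size):
        genes_a = set(order_a[start:start + bin_size].tolist())
        genes_b = set(order_b[start:start + bin_size].tolist())
        fractions.append(len(genes_a & genes_b) / len(genes_a))
    return fractions

# Toy example (hypothetical p-values for four genes, bins of two genes).
p_model_1 = [0.01, 0.02, 0.50, 0.60]
p_model_2 = [0.01, 0.50, 0.02, 0.60]
overlap = binned_overlap(p_model_1, p_model_2, bin_size=2)
```

The same function applies whether the two vectors come from two models on one dataset (as in Figure 3) or from one model on two replicate datasets (as in Figures 4C and 6C).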

Cite

Bossers, K., Ylstra, B., Brakenhoff, R. H., Smeets, S. J., Verhaagen, J., & van de Wiel, M. A. (2010). Intensity-based analysis of dual-color gene expression data as an alternative to ratio-based analysis to enhance reproducibility. BMC Genomics, 11(1). https://doi.org/10.1186/1471-2164-11-112
