Microarray experiments often yield a normal data matrix X whose rows correspond to genes and columns to samples. We commonly calculate test statistics Z = Xw, where Z_i is a test statistic for the ith gene, and apply false discovery rate (FDR) controlling methods to find interesting genes. For example, Z could measure the difference in expression levels between treatment and control groups, and we could seek differentially expressed genes. The empirical cdf of Z is important for FDR methods, since its mean and variance determine the bias and variance of FDR estimates. Efron (2009b) has shown that if the columns of X are independent, the variance of the empirical cdf of Z depends only on the mean-squared row correlation. Microarray data, however, frequently show signs of column dependence. In this paper, we show that Efron’s result still holds under column dependence, and give a conservative (upwardly biased) estimator for the mean-squared row correlation. We show that Fisher’s transformation for sample correlations is still normalizing and variance stabilizing under column dependence, and use it to construct a permutation-invariant test of column independence. Finally, we argue that estimating the mean-squared row correlation under column dependence is impossible in general. Code to perform our test is available in the R package “colcor” on CRAN. © 2010, Institute of Mathematical Statistics. All rights reserved.
Muralidharan, O. (2010). Detecting column dependence when rows are correlated and estimating the strength of the row correlation. Electronic Journal of Statistics, 4, 1527–1546. https://doi.org/10.1214/10-EJS592