Motivation: The pre-estimate of the proportion of null hypotheses (π0) plays a critical role in controlling false discovery rate (FDR) in multiple hypothesis testing. However, hidden complex dependence structures of many genomics datasets distort the distribution of p-values, rendering existing π0 estimators less effective.

Results: From the basic non-linear model of the q-value method, we developed a simple linear algorithm to probe local dependence blocks. We uncovered a non-static relationship between tests' p-values and their corresponding q-values that is influenced by data structure and π0. Using an optimization framework, these findings were exploited to devise a Sliding Linear Model (SLIM) to more reliably estimate π0 under dependence. When tested on a number of simulation datasets with varying data dependence structures and on microarray data, SLIM was found to be robust in estimating π0 against dependence. The accuracy of its π0 estimation suggests that SLIM can be used as a stand-alone tool for prediction of significant tests.

© The Author 2010. Published by Oxford University Press. All rights reserved.
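
To make concrete what estimating π0 from p-values involves, here is a minimal sketch of the conventional λ-threshold estimator associated with the q-value method, π0(λ) = #{p > λ} / (m(1 − λ)). This is not the SLIM algorithm, which the abstract does not specify in enough detail to reproduce. The function names `storey_pi0` and `simulate_pvalues`, the Beta(0.2, 1) alternative distribution, and the simulation settings are all assumptions made for illustration, and the simulated tests are independent, so the sketch does not exhibit the dependence effects the abstract is concerned with.

```python
import random


def storey_pi0(pvalues, lam=0.5):
    """Lambda-threshold estimate of pi0: share of p-values above lam, rescaled by (1 - lam)."""
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must be in [0, 1)")
    m = len(pvalues)
    if m == 0:
        raise ValueError("pvalues must be non-empty")
    above = sum(1 for p in pvalues if p > lam)
    # Cap at 1 because a proportion cannot exceed 1.
    return min(1.0, above / (m * (1.0 - lam)))


def simulate_pvalues(m=10000, pi0=0.8, seed=0):
    """Hypothetical simulation: independent tests, uniform nulls, Beta(0.2, 1) alternatives."""
    rng = random.Random(seed)
    n_null = int(round(m * pi0))
    # Null p-values are uniform on [0, 1).
    nulls = [rng.random() for _ in range(n_null)]
    # Alternative p-values are concentrated near 0 (an assumed distribution).
    alts = [rng.betavariate(0.2, 1.0) for _ in range(m - n_null)]
    return nulls + alts


pvals = simulate_pvalues()
# The estimate depends on the threshold lam, so several thresholds are evaluated.
estimates = {lam: storey_pi0(pvals, lam) for lam in (0.1, 0.3, 0.5, 0.7, 0.9)}
for lam, est in estimates.items():
    print(f"lambda={lam:.1f}  pi0_hat={est:.3f}")
```

Because some alternative p-values also fall above λ, this estimator tends to overestimate π0 at any fixed λ, and the choice of λ trades bias against variance.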
CITATION STYLE
Wang, H. Q., Tuominen, L. K., & Tsai, C. J. (2011). SLIM: A sliding linear model for estimating the proportion of true null hypotheses in datasets with dependence structures. Bioinformatics, 27(2), 225–231. https://doi.org/10.1093/bioinformatics/btq650