A general method for accurate estimation of false discovery rates in identification of differentially expressed genes


Abstract

'Omic' data, including genomic, transcriptomic, proteomic and single nucleotide polymorphism data, have been growing rapidly. These data are large-scale and high-throughput; they challenge traditional statistical methodologies and require multiple testing. Several multiple-testing procedures, such as the Bonferroni procedure, the Benjamini-Hochberg (BH) procedure and the Westfall-Young procedure, have been developed; some control the family-wise error rate and others control the false discovery rate (FDR). These procedures are valid in some cases but cannot be applied to all types of large-scale data. To address this statistically challenging problem in the analysis of omic data, we propose a general method for generating a set of multiple-testing procedures. The method is based on the BH theorems. By choosing a C-value, one can realize a specific multiple-testing procedure. For example, setting C = 1.22 produces the BH procedure. With C < 1.22, our method generates procedures that weakly control FDR, and with C > 1.22, procedures that strongly control FDR. The procedures with C = G (the number of genes or tests) and C = 0 are, respectively, the Bonferroni procedure and the single-testing procedure; these are the two extremes of the family. To let one choose an appropriate multiple-testing procedure in practice, we develop an algorithm by which FDR can be correctly and reliably estimated. Simulation results show that our method estimates FDR accurately in various scenarios, and we illustrate its application with three real datasets. © The Author 2014.
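For orientation, the three named reference points of this family (single testing, the BH procedure and the Bonferroni procedure) can be sketched in their standard textbook form. This is a minimal illustration only: it does not implement the authors' C-value parameterization or their FDR-estimation algorithm, neither of which is specified in the abstract. The function names and the example p-values are assumptions made for this sketch.

```python
def single_test_reject(pvalues, alpha=0.05):
    """No multiplicity correction: reject every test with p <= alpha."""
    return [p <= alpha for p in pvalues]


def bonferroni_reject(pvalues, alpha=0.05):
    """Bonferroni: reject test i if p_i <= alpha / G, with G the number of tests."""
    g = len(pvalues)
    return [p <= alpha / g for p in pvalues]


def bh_reject(pvalues, alpha=0.05):
    """Standard Benjamini-Hochberg step-up procedure.

    Sort the p-values, find the largest rank k with p_(k) <= alpha * k / G,
    and reject the k hypotheses with the smallest p-values.
    """
    g = len(pvalues)
    order = sorted(range(g), key=lambda i: pvalues[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= alpha * rank / g:
            k = rank  # keep the largest rank that satisfies the bound
    reject = [False] * g
    for idx in order[:k]:
        reject[idx] = True
    return reject


# Hypothetical p-values for ten genes (illustrative, not from the paper).
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
n_single = sum(single_test_reject(pvals))
n_bh = sum(bh_reject(pvals))
n_bonferroni = sum(bonferroni_reject(pvals))
```

On these illustrative p-values the number of rejections shrinks from single testing to BH to Bonferroni, which matches the ordering the abstract describes between the two extreme procedures and the BH procedure.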


Citation (APA)

Tan, Y. D., & Xu, H. (2014). A general method for accurate estimation of false discovery rates in identification of differentially expressed genes. Bioinformatics, 30(14), 2018–2025. https://doi.org/10.1093/bioinformatics/btu124
