A combined approach with gene-wise normalization improves the analysis of RNA-seq data in human breast cancer subtypes

Abstract

Breast cancer (BC) is increasing in incidence and in resistance to treatment worldwide. Limited therapeutic options and poor survival outcomes persist across BC subtypes because of the disease's molecular heterogeneity and its resistance to standard endocrine therapy. Recently, high-throughput RNA sequencing (RNA-seq) has been used to identify biomarkers of disease progression and signaling pathways that could be amenable to specific therapies according to the BC subtype. However, there is no single generally accepted pipeline for the analysis of RNA-seq data in biomarker discovery, due in part to the need to satisfy sensitivity and specificity constraints simultaneously. We propose a combined approach that uses a gene-wise normalization, UQ-pgQ2, followed by a Wald test from DESeq2. Judged by within-group comparisons, this approach improved specificity when applied to publicly available RNA-seq BC datasets. To identify differentially expressed genes (DEGs), we combined an optimized log2 fold change cutoff with a nominal false discovery rate of 0.05 to further minimize false positives. Using this method to analyze two GEO BC datasets, we identified 797 DEGs uniquely expressed in triple negative BC (TNBC) and significantly associated with T cell and immune-related signaling, contributing to immunotherapeutic efficacy in TNBC patients. In contrast, we identified 1403 DEGs uniquely expressed in estrogen receptor positive and HER2 negative BC (ER+HER2-BC) and significantly associated with eicosanoid, Notch and FAK signaling, while a common set of genes was associated with cellular growth and proliferation. Thus, our approach to controlling false positives identified two distinct gene expression profiles associated with these two BC subtypes, which are distinguishable by their molecular and functional attributes.
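The DEG selection rule described in the abstract (a nominal FDR of 0.05 combined with a |log2 fold change| cutoff) can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes the Wald-test p-values and log2 fold changes have already been computed elsewhere (for example by DESeq2), it uses the Benjamini-Hochberg procedure for the FDR adjustment, and it assumes the observed false positive rate is the fraction of all tested genes called differentially expressed in a within-group comparison, where no true differences are expected. The function names `bh_adjust`, `select_degs` and `observed_fpr`, and the choice of strict versus non-strict inequalities, are hypothetical.

```python
from typing import List, Sequence


def bh_adjust(pvalues: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(
    genes: Sequence[str],
    log2fc: Sequence[float],
    pvalues: Sequence[float],
    lfc_cutoff: float,
    fdr: float = 0.05,
) -> List[str]:
    """Keep genes passing both the FDR threshold and the |log2FC| cutoff.

    Assumption: a gene passes when padj < fdr and |log2FC| >= lfc_cutoff.
    """
    padj = bh_adjust(pvalues)
    return [
        gene
        for gene, lfc, q in zip(genes, log2fc, padj)
        if q < fdr and abs(lfc) >= lfc_cutoff
    ]


def observed_fpr(n_called: int, n_genes: int) -> float:
    """Fraction of tested genes called DE in a within-group comparison.

    Assumption: every call in a within-group comparison is a false positive.
    """
    if n_genes <= 0:
        raise ValueError("n_genes must be positive")
    return n_called / n_genes


if __name__ == "__main__":
    # Toy values for illustration only; they are not taken from the article.
    genes = ["geneA", "geneB", "geneC", "geneD"]
    log2fc = [2.1, -0.3, 1.8, -2.5]
    pvalues = [0.001, 0.02, 0.04, 0.5]
    print(select_degs(genes, log2fc, pvalues, lfc_cutoff=1.0))
```

Under these assumptions, a |log2FC| cutoff could be tuned by raising it until `observed_fpr` on a within-group comparison falls below a chosen tolerance, and then reusing that cutoff in `select_degs` for the between-group comparison.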

Figures

  • Table 1. Summary of normalization methods and software packages used.
  • Table 2. DEG analysis performed via within-group and between-group comparisons using three methods. The DEGs from between-group comparisons, shown in bold, are determined at an FDR of 0.05.
  • Table 3. Determining an optimal |logFC| cutoff from the observed FPR. The observed FPR, based on all 35,203 genes, is computed for each |logFC| cutoff given in parentheses.
  • Table 4. DEGs identified using DESeq2 and UQ-pgQ2. The DEGs among 17,584 protein-coding genes are determined at a nominal FDR of 0.05 and the optimal |logFC| cutoff from Table 3.
  • Table 5. An approach to select DEGs (protein coding genes) identified by UQ-pgQ2 and DESeq2.
  • Fig 1. Hierarchical clustering heatmaps of BC based on the DESeq-normalized gene expression levels. The genes with similar expression patterns are clustered together. The up-regulated genes are in red and the down-regulated genes are in green. (A) A heatmap based on gene expression levels of 1,693 DEGs uniquely identified in TNBC data. (B) A heatmap based on gene expression of 2,299 DEGs uniquely identified in ER+HER2-BC data.
  • Table 6. Biomarkers identified for TNBC and ER+HER2-BC.
  • Table 7. DEGs associated with cancer biology, as identified by IPA.

Cite

Li, X., Rouchka, E. C., Brock, G. N., Yan, J., O’Toole, T. E., Tieri, D. A., & Cooper, N. G. F. (2018). A combined approach with gene-wise normalization improves the analysis of RNA-seq data in human breast cancer subtypes. PLoS ONE, 13(8). https://doi.org/10.1371/journal.pone.0201813
