scSensitiveGeneDefine: sensitive gene detection in single-cell RNA sequencing data by Shannon entropy

5Citations
Citations of this article
10Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Background: Single-cell RNA sequencing (scRNA-seq) is the most widely used technique to obtain gene expression profiles from complex tissues. Cell subsets and developmental states are often identified via differential gene expression patterns. Most of the single-cell tools utilized highly variable genes to annotate cell subsets and states. However, we have discovered that a group of genes, which sensitively respond to environmental stimuli with high coefficients of variation (CV), might impose overwhelming influences on the cell type annotation. Result: In this research, we developed a method, based on the CV-rank and Shannon entropy, to identify these noise genes, and termed them as “sensitive genes”. To validate the reliability of our methods, we applied our tools in 11 single-cell data sets from different human tissues. The results showed that most of the sensitive genes were enriched pathways related to cellular stress response. Furthermore, we noticed that the unsupervised result was closer to the ground-truth cell labels, after removing the sensitive genes detected by our tools. Conclusion: Our study revealed the prevalence of stochastic gene expression patterns in most types of cells, compared the differences among cell marker genes, housekeeping genes (HK genes), and sensitive genes, demonstrated the similarities of functions of sensitive genes in various scRNA-seq data sets, and improved the results of unsupervised clustering towards the ground-truth labels. We hope our method would provide new insights into the reduction of data noise in scRNA-seq data analysis and contribute to the development of better scRNA-seq unsupervised clustering algorithms in the future.

Cite

CITATION STYLE

APA

Chen, Z., Yang, Z., Yuan, X., Zhang, X., & Hao, P. (2021). scSensitiveGeneDefine: sensitive gene detection in single-cell RNA sequencing data by Shannon entropy. BMC Bioinformatics, 22(1). https://doi.org/10.1186/s12859-021-04136-1

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free