Normalization by distributional resampling of high throughput single-cell RNA-sequencing data

Abstract

Motivation: Normalization to remove technical or experimental artifacts is critical in the analysis of single-cell RNA-sequencing experiments, even those for which unique molecular identifiers are available. The majority of methods for normalizing single-cell RNA-sequencing data adjust average expression for library size (LS), allowing the variance and other properties of the gene-specific expression distribution to be non-constant in LS. This often results in reduced power and increased false discoveries in downstream analyses, a problem which is exacerbated by the high proportion of zeros present in most datasets.

Results: To address this, we present Dino, a normalization method based on a flexible negative-binomial mixture model of gene expression. As demonstrated in both simulated and case study datasets, by normalizing the entire gene expression distribution, Dino is robust to shallow sequencing, sample heterogeneity and varying zero proportions, leading to improved performance in downstream analyses in a number of settings.
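
The motivating point, that adjusting average expression for LS can leave other properties of the distribution dependent on LS, can be illustrated with a small worked example. This is only a sketch under a simplifying assumption that is not taken from the article: counts for a single gene are treated as Poisson with mean `rate * depth`, where `rate` and `depth` are hypothetical names for the gene's underlying expression level and the cell's relative library size. It does not implement Dino or its negative-binomial mixture model; it only shows what plain division by library size does and does not equalize.

```python
import math


def scaled_poisson_summary(rate, depth):
    """Summarize Y = X / depth, where X ~ Poisson(rate * depth).

    `rate` and `depth` are illustrative (hypothetical) parameters:
    the gene's expression level and the cell's relative library size.
    """
    mu = rate * depth  # mean of the raw count X
    return {
        "mean": mu / depth,            # E[Y] = rate, independent of depth
        "variance": mu / depth ** 2,   # Var[Y] = rate / depth, depends on depth
        "zero_prob": math.exp(-mu),    # P(Y = 0) = exp(-rate * depth)
    }


# Same gene, three cells that differ only in relative library size.
for depth in (0.5, 1.0, 2.0):
    s = scaled_poisson_summary(rate=1.0, depth=depth)
    print(f"depth={depth}: mean={s['mean']:.3f}, "
          f"variance={s['variance']:.3f}, zero_prob={s['zero_prob']:.3f}")
```

Under this assumption the scaled mean is identical across cells, while the variance and the proportion of zeros still change with library size, which is the kind of residual dependence the abstract describes.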

Citation (APA)

Brown, J., Ni, Z., Mohanty, C., Bacher, R., & Kendziorski, C. (2021). Normalization by distributional resampling of high throughput single-cell RNA-sequencing data. Bioinformatics, 37(22), 4123–4128. https://doi.org/10.1093/bioinformatics/btab450
