How tool combinations in different pipeline versions affect the outcome in RNA-seq analysis

This article is free to access.

Abstract

Data analysis tools are continuously changed and improved over time. To test how these changes influence the comparability between analyses, the outputs of different workflow options of the nf-core/rnaseq pipeline were compared. Five different pipeline settings (STAR+Salmon, STAR+RSEM, STAR+featureCounts, HISAT2+featureCounts, pseudoaligner Salmon) were run on three datasets (human, Arabidopsis, zebrafish) containing spike-ins of the External RNA Control Consortium (ERCC). Fold change ratios and differential expression of genes and spike-ins were used for comparative analyses of the different tool and version settings of the pipeline. An overlap of 85% for differential gene classification between pipelines could be shown. Genes interpreted with a bias were mostly those present at lower concentration. The number of isoforms and exons per gene were also determinants. Previous pipeline versions using featureCounts showed a higher sensitivity for detecting one-isoform genes such as ERCC. To ensure data comparability in long-term analysis series, it would be advisable either to stay with the pipeline version the series was initialized with or to run both versions during a transition period, so that the target genes are addressed the same way.
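
As a rough illustration of what an overlap in differential gene classification between pipelines could mean, the sketch below computes the fraction of genes that receive the same label from every pipeline. This is only one possible definition and is an assumption; the abstract does not state how the 85% figure was calculated. The function name `classification_overlap`, the labels ("up", "down", "ns") and the toy gene names are hypothetical and are not data from the study.

```python
from typing import Dict


def classification_overlap(calls: Dict[str, Dict[str, str]]) -> float:
    """Return the fraction of shared genes labelled identically by all pipelines.

    `calls` maps a pipeline name to a {gene: label} dictionary.
    This agreement measure is an assumption, not the paper's stated method.
    """
    pipelines = list(calls)
    # Only genes reported by every pipeline can be compared.
    shared = set.intersection(*(set(calls[p]) for p in pipelines))
    if not shared:
        return 0.0
    # A gene counts as agreeing when all pipelines assign the same label.
    agree = sum(
        1 for gene in shared
        if len({calls[p][gene] for p in pipelines}) == 1
    )
    return agree / len(shared)


# Hypothetical toy labels, made up for illustration only.
calls = {
    "STAR+Salmon": {"geneA": "up", "geneB": "ns", "geneC": "down", "geneD": "ns"},
    "STAR+featureCounts": {"geneA": "up", "geneB": "ns", "geneC": "down", "geneD": "up"},
    "HISAT2+featureCounts": {"geneA": "up", "geneB": "ns", "geneC": "down", "geneD": "up"},
}
print(classification_overlap(calls))  # 0.75: geneD is classified differently
```

In this toy case three of the four shared genes agree across all pipelines. A pairwise comparison between two pipelines would be an equally plausible reading of "overlap".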

Cite

Perelo, L. W., Gabernet, G., Straub, D., & Nahnsen, S. (2024). How tool combinations in different pipeline versions affect the outcome in RNA-seq analysis. NAR Genomics and Bioinformatics, 6(1). https://doi.org/10.1093/nargab/lqae020
