A comparison of transcriptome analysis methods with reference genome

22Citations
Citations of this article
114Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Background: The application of RNA-seq technology has become more extensive and the number of analysis procedures available has increased over the past years. Selecting an appropriate workflow has become an important issue for researchers in the field. Methods: In our study, six popular analytical procedures/pipeline were compared using four RNA-seq datasets from mouse, human, rat, and macaque, respectively. The gene expression value, fold change of gene expression, and statistical significance were evaluated to compare the similarities and differences among the six procedures. qRT-PCR was performed to validate the differentially expressed genes (DEGs) from all six procedures. Results: Cufflinks-Cuffdiff demands the highest computing resources and Kallisto-Sleuth demands the least. Gene expression values, fold change, p and q values of differential expression (DE) analysis are highly correlated among procedures using HTseq for quantification. For genes with medium expression abundance, the expression values determined using the different procedures were similar. Major differences in expression values come from genes with particularly high or low expression levels. HISAT2-StringTie-Ballgown is more sensitive to genes with low expression levels, while Kallisto-Sleuth may only be useful to evaluate genes with medium to high abundance. When the same thresholds for fold change and p value are chosen in DE analysis, StringTie-Ballgown produce the least number of DEGs, while HTseq-DESeq2, -edgeR or -limma generally produces more DEGs. The performance of Cufflinks-Cuffdiff and Kallisto-Sleuth varies in different datasets. For DEGs with medium expression levels, the biological verification rates were similar among all procedures. Conclusion: Results are highly correlated among RNA-seq analysis procedures using HTseq for quantification. Difference in gene expression values mainly come from genes with particularly high or low expression levels. Moreover, biological validation rates of DEGs from all six procedures were similar for genes with medium expression levels. Investigators can choose analytical procedures according to their available computer resources, or whether genes of high or low expression levels are of interest. If computer resources are abundant, one can utilize multiple procedures to obtain the intersection of results to get the most reliable DEGs, or to obtain a combination of results to get a more comprehensive DE profile for transcriptomes.

Cite

CITATION STYLE

APA

Liu, X., Zhao, J., Xue, L., Zhao, T., Ding, W., Han, Y., & Ye, H. (2022). A comparison of transcriptome analysis methods with reference genome. BMC Genomics, 23(1). https://doi.org/10.1186/s12864-022-08465-0

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free