Systematic and benchmarking studies of pipelines for mammal WGBS data in the novel NGS platform

1Citations
Citations of this article
15Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Background: Whole genome bisulfite sequencing (WGBS), possesses the aptitude to dissect methylation status at the nucleotide-level resolution of 5-methylcytosine (5-mC) on a genome-wide scale. It is a powerful technique for epigenome in various cell types, and tissues. As a recently established next-generation sequencing (NGS) platform, GenoLab M is a promising alternative platform. However, its comprehensive evaluation for WGBS has not been reported. We sequenced two bisulfite-converted mammal DNA in this research using our GenoLab M and NovaSeq 6000, respectively. Then, we systematically compared those data via four widely used WGBS tools (BSMAP, Bismark, BatMeth2, BS-Seeker2) and a new bisulfite-seq tool (BSBolt). We interrogated their computational time, genome depth and coverage, and evaluated their percentage of methylated Cs. Result: Here, benchmarking a combination of pre- and post-processing methods, we found that trimming improved the performance of mapping efficiency in eight datasets. The data from two platforms uncovered ~ 80% of CpG sites genome-wide in the human cell line. Those data sequenced by GenoLab M achieved a far lower proportion of duplicates (~ 5.5%). Among pipelines, BSMAP provided an intriguing representation of 5-mC distribution at CpG sites with 5-mC levels > ~ 78% in datasets from human cell lines, especially in the GenoLab M. BSMAP performed more advantages in running time, uniquely mapped reads percentages, genomic coverage, and quantitative accuracy. Finally, compared with the previous methylation pattern of human cell line and mouse tissue, we confirmed that the data from GenoLab M performed similar consistency and accuracy in methylation levels of CpG sites with that from NovaSeq 6000. Conclusion: Together we confirmed that GenoLab M was a qualified NGS platform for WGBS with high performance. Our results showed that BSMAP was the suitable pipeline that allowed for WGBS studies on the GenoLab M platform.

Cite

CITATION STYLE

APA

Lin, Q. ting, Yang, W., Zhang, X., Li, Q. gang, Liu, Y. feng, Yan, Q., & Sun, L. (2023). Systematic and benchmarking studies of pipelines for mammal WGBS data in the novel NGS platform. BMC Bioinformatics, 24(1). https://doi.org/10.1186/s12859-023-05163-w

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free