Using QC-blind for quality control and contamination screening of bacteria DNA sequencing data without reference genome

Wang Xi; Yan Gao; Zhangyu Cheng; Chaoyun Chen; Maozhen Han; Pengshuo Yang; Guangzhou Xiong; Kang Ning

Journal Article, Open Access

Frontiers in Microbiology (2019) 10(JULY)

DOI: 10.3389/fmicb.2019.01560

Abstract

Quality control for next-generation sequencing (NGS) has become increasingly important as omics studies rely more and more heavily on sequencing data. Tools have been developed for filtering possible contaminants from species with known reference genomes. Unfortunately, these tools require reference genomes for all the species involved, including the contaminants. This excludes many real-life samples for which no complete genome of the target species is available and which are contaminated with unknown microbial species. In this work we propose QC-Blind, a novel quality control pipeline for removing contaminants without any use of reference genomes. The pipeline requires only information about a few marker genes of the target species. It consists of four steps: unsupervised read assembly, contig binning, read clustering, and marker gene assignment. When evaluated on in silico, ab initio and in vivo datasets, QC-Blind proved effective in removing unknown contaminants with high specificity and accuracy, while preserving most of the genomic information of the target bacterial species. QC-Blind could therefore serve well in situations where limited information is available for both the target and the contaminating species.
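The last step of the pipeline, marker gene assignment, can be illustrated with a minimal sketch. It is not the QC-Blind implementation: it assumes that assembly, binning and read clustering have already produced simple lookup tables, and that marker gene hits on contigs are available from some alignment step. The function names, the `min_markers` threshold and all identifiers in the example data are hypothetical.

```python
from collections import defaultdict


def select_target_bins(contig_to_bin, marker_hits, min_markers=1):
    """Return the bins that carry enough distinct target marker genes.

    contig_to_bin: dict mapping contig id -> bin id (assumed binning output).
    marker_hits:   iterable of (contig id, marker gene name) pairs
                   (assumed output of aligning target markers to contigs).
    min_markers:   hypothetical threshold on distinct markers per bin.
    """
    markers_per_bin = defaultdict(set)
    for contig, marker in marker_hits:
        bin_id = contig_to_bin.get(contig)
        if bin_id is not None:  # ignore hits on unbinned contigs
            markers_per_bin[bin_id].add(marker)
    return {b for b, found in markers_per_bin.items() if len(found) >= min_markers}


def filter_reads(read_to_contig, contig_to_bin, target_bins):
    """Keep only reads whose contig falls in a bin assigned to the target."""
    kept = [
        read
        for read, contig in read_to_contig.items()
        if contig_to_bin.get(contig) in target_bins
    ]
    return sorted(kept)


# Toy data: binA holds two target markers, binB holds none.
contig_to_bin = {"c1": "binA", "c2": "binA", "c3": "binB"}
marker_hits = [("c1", "marker_1"), ("c2", "marker_2")]
read_to_contig = {"r1": "c1", "r2": "c3", "r3": "c2", "r4": "c9"}

target_bins = select_target_bins(contig_to_bin, marker_hits)
clean_reads = filter_reads(read_to_contig, contig_to_bin, target_bins)
print(target_bins, clean_reads)
```

In this toy run, reads mapped to the marker-free bin or to an unbinned contig are treated as putative contaminants and dropped. A real pipeline would need to decide how to handle bins with no marker hits at all, which this sketch simply discards.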

Cite

Xi, W., Gao, Y., Cheng, Z., Chen, C., Han, M., Yang, P., … Ning, K. (2019). Using QC-blind for quality control and contamination screening of bacteria DNA sequencing data without reference genome. Frontiers in Microbiology, 10(JULY). https://doi.org/10.3389/fmicb.2019.01560
