A study on fast calling variants from next-generation sequencing data using decision tree

14Citations
Citations of this article
57Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Background: The rapid development of next-generation sequencing (NGS) technology has continuously been refreshing the throughput of sequencing data. However, due to the lack of a smart tool that is both fast and accurate, the analysis task for NGS data, especially those with low-coverage, remains challenging. Results: We proposed a decision-tree based variant calling algorithm. Experiments on a set of real data indicate that our algorithm achieves high accuracy and sensitivity for SNVs and indels and shows good adaptability on low-coverage data. In particular, our algorithm is obviously faster than 3 widely used tools in our experiments. Conclusions: We implemented our algorithm in a software named Fuwa and applied it together with 4 well-known variant callers, i.e., Platypus, GATK-UnifiedGenotyper, GATK-HaplotypeCaller and SAMtools, to three sequencing data sets of a well-studied sample NA12878, which were produced by whole-genome, whole-exome and low-coverage whole-genome sequencing technology respectively. We also conducted additional experiments on the WGS data of 4 newly released samples that have not been used to populate dbSNP.

Cite

CITATION STYLE

APA

Li, Z., Wang, Y., & Wang, F. (2018). A study on fast calling variants from next-generation sequencing data using decision tree. BMC Bioinformatics, 19(1). https://doi.org/10.1186/s12859-018-2147-9

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free