Comparative analysis of de novo assemblers for variation discovery in personal genomes

11Citations
Citations of this article
56Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Current variant discovery approaches often rely on an initial read mapping to the reference sequence. Their effectiveness is limited by the presence of gaps, potential misassemblies, regions of duplicates with a high-sequence similarity and regions of high-sequence divergence in the reference. Also, mapping-based approaches are less sensitive to large INDELs and complex variations and provide little phase information in personal genomes. A few de novo assemblers have been developed to identify variants through direct variant calling from the assembly graph, micro-assembly and whole-genome assembly, but mainly for whole-genome sequencing (WGS) data. We developed SGVar, a de novo assembly workflow for haplotype-based variant discovery from whole-exome sequencing (WES) data. Using simulated human exome data, we compared SGVar with five variation-aware de novo assemblers and with BWA-MEM together with three haplotype- or local de novo assembly-based callers. SGVar outperforms the other assemblers in sensitivity and tolerance of sequencing errors. We recapitulated the findings on whole-genome and exome data from a Utah residents with Northern and Western European ancestry (CEU) trio, showing that SGVar had high sensitivity both in the highly divergent human leukocyte antigen (HLA) region and in non-HLA regions of chromosome 6. In particular, SGVar is robust to sequencing error, k-mer selection, divergence level and coverage depth. Unlike mapping-based approaches, SGVar is capable of resolving long-range phase and identifying large INDELs from WES, more prominently from WGS. We conclude that SGVar represents an ideal platform for WES-based variant discovery in highly divergent regions and across the whole genome.

Cite

CITATION STYLE

APA

Tian, S., Yan, H., Klee, E. W., Kalmbach, M., & Slager, S. L. (2018). Comparative analysis of de novo assemblers for variation discovery in personal genomes. Briefings in Bioinformatics, 19(5), 893–904. https://doi.org/10.1093/bib/bbx037

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free