Despite recent improvements in nanopore basecalling accuracy, germline variant calling of small insertions and deletions (INDELs) remains poor. Although precision and recall for single nucleotide polymorphisms (SNPs) now exceed 99.5%, INDEL recall remains below 80% for standard R9.4.1 flow cells. We show that read phasing and realignment can recover a significant portion of false negative INDELs. In particular, we extend Needleman-Wunsch affine gap alignment by introducing new gap penalties for more accurately aligning repeated n-polymer sequences such as homopolymers (n = 1) and tandem repeats (2 ≤ n ≤ 6). At the same precision, haplotype phasing improves INDEL recall from 63.76% to 70.66%, and nPoRe realignment improves it further to 73.04%.
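The idea of giving gaps inside n-polymers their own penalties can be illustrated with a minimal Python sketch. It is not the nPoRe implementation: the abstract does not state the penalty scheme, so the function names (`npolymer_runs`, `repeat_mask`, `affine_score`), the score values, and the rule that only the gap-open penalty is relaxed inside a repeat are all assumptions made for illustration. The sketch first locates homopolymers and tandem repeats (unit length 1 to 6) in the reference, then runs a standard three-matrix affine-gap global alignment in which gaps opened inside those repeats are cheaper.

```python
NEG = float("-inf")

def npolymer_runs(seq, max_n=6, min_copies=3):
    """Return (start, unit_len, copies) for exact repeats with unit length 1..max_n."""
    runs = []
    for n in range(1, max_n + 1):
        i = 0
        while i + n <= len(seq):
            unit = seq[i:i + n]
            # Skip units that are repeats of a shorter unit (e.g. "AA" when n = 2).
            if any(n % k == 0 and unit == unit[:k] * (n // k) for k in range(1, n)):
                i += 1
                continue
            copies = 1
            while seq[i + copies * n:i + (copies + 1) * n] == unit:
                copies += 1
            if copies >= min_copies:
                runs.append((i, n, copies))
                i += copies * n
            else:
                i += 1
    return runs

def repeat_mask(seq, runs):
    """Boolean list marking reference positions that lie inside any run."""
    mask = [False] * len(seq)
    for start, n, copies in runs:
        for p in range(start, start + n * copies):
            mask[p] = True
    return mask

def affine_score(ref, query, mask, match=1, mismatch=-1,
                 gap_open=-3, gap_extend=-1, repeat_gap_open=-1):
    """Global affine-gap alignment score (three-matrix formulation).

    Assumption for illustration: a gap opened at a masked (repeat)
    reference position pays repeat_gap_open instead of gap_open.
    A gap of length L costs open + L * gap_extend.
    """
    n, m = len(ref), len(query)
    M = [[NEG] * (m + 1) for _ in range(n + 1)]  # ends in match/mismatch
    D = [[NEG] * (m + 1) for _ in range(n + 1)]  # ends in deletion from ref
    I = [[NEG] * (m + 1) for _ in range(n + 1)]  # ends in insertion in query
    M[0][0] = 0

    def open_at(i):
        # Penalty for opening a gap at reference position i - 1.
        return repeat_gap_open if i >= 1 and mask[i - 1] else gap_open

    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0:
                D[i][j] = max(max(M[i - 1][j], I[i - 1][j]) + open_at(i) + gap_extend,
                              D[i - 1][j] + gap_extend)
            if j > 0:
                I[i][j] = max(max(M[i][j - 1], D[i][j - 1]) + open_at(i) + gap_extend,
                              I[i][j - 1] + gap_extend)
            if i > 0 and j > 0:
                s = match if ref[i - 1] == query[j - 1] else mismatch
                M[i][j] = max(M[i - 1][j - 1], D[i - 1][j - 1], I[i - 1][j - 1]) + s
    return max(M[n][m], D[n][m], I[n][m])

ref, read = "GGAAAAATC", "GGAAAATC"      # read is missing one A of the homopolymer
runs = npolymer_runs(ref)                # homopolymer of five A's starting at index 2
plain = affine_score(ref, read, [False] * len(ref))
aware = affine_score(ref, read, repeat_mask(ref, runs))
print(runs, plain, aware)
```

With these illustrative scores, the single-base deletion inside the homopolymer is penalized less under the repeat-aware mask than under uniform penalties, so an aligner using such a scheme would prefer to place INDELs within the repeat rather than scatter mismatches around it.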
Dunn, T., Blaauw, D., Das, R., & Narayanasamy, S. (2023). nPoRe: n-polymer realigner for improved pileup-based variant calling. BMC Bioinformatics, 24(1). https://doi.org/10.1186/s12859-023-05193-4