Roughly 3% of the human genome is composed of variable-number tandem repeats (VNTRs): arrays of motifs at least six bases long. These loci are highly polymorphic, yet current approaches that define and merge variants based on alignment breakpoints do not capture their full diversity. Here we present vamos (VNTR Annotation using efficient Motif Sets), a method that instead annotates VNTRs by repeat composition under different levels of motif diversity. Applied to 74 haplotype-resolved human assemblies, vamos estimates 7.4–16.7 alleles per locus, compared with 4.0–5.5 alleles per locus estimated by breakpoint-based approaches.
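To illustrate why annotating by repeat composition can distinguish more alleles than a coarser definition, here is a minimal Python sketch. It is not the vamos algorithm: the greedy exact-match tiling, the function names `annotate` and `count_alleles`, the two six-base motifs, and the four toy haplotypes are all assumptions made for illustration, and sequence length is used only as a crude stand-in for a breakpoint-based allele definition.

```python
def annotate(sequence, motifs):
    """Tile `sequence` with exact motif copies, trying longer motifs first.

    Hypothetical helper: a greedy toy, not the method used by vamos.
    Raises ValueError if no listed motif matches at some position.
    """
    ordered = sorted(motifs, key=len, reverse=True)
    annotation = []
    pos = 0
    while pos < len(sequence):
        for motif in ordered:
            if sequence.startswith(motif, pos):
                annotation.append(motif)
                pos += len(motif)
                break
        else:
            raise ValueError(f"no motif matches at position {pos}")
    return tuple(annotation)


def count_alleles(haplotypes, key):
    """Count distinct alleles at one locus under a given allele definition."""
    return len({key(h) for h in haplotypes})


# Toy locus: two assumed six-base motifs and four assumed haplotype sequences.
MOTIFS = ["ACGTTG", "ACGTCG"]
HAPLOTYPES = [
    "ACGTTG" "ACGTTG" "ACGTCG",
    "ACGTTG" "ACGTCG" "ACGTTG",  # same length as the first, different order
    "ACGTTG" "ACGTTG" "ACGTCG",  # identical to the first
    "ACGTTG" "ACGTTG",           # one repeat unit shorter
]

# Allele definition 1: length only (stand-in for a coarser definition).
alleles_by_length = count_alleles(HAPLOTYPES, key=len)

# Allele definition 2: the ordered string of motifs that makes up the repeat.
alleles_by_composition = count_alleles(
    HAPLOTYPES, key=lambda h: annotate(h, MOTIFS)
)

print(alleles_by_length, alleles_by_composition)
```

In this toy locus the four haplotypes collapse to two alleles by length but three by motif composition, because two haplotypes of equal length carry their motifs in a different order.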
Ren, J., Gu, B., & Chaisson, M. J. P. (2023). vamos: variable-number tandem repeats annotation using efficient motif sets. Genome Biology, 24(1). https://doi.org/10.1186/s13059-023-03010-y