HierCC: a multi-level clustering scheme for population assignments based on core genome MLST

Abstract

Motivation: Routine infectious disease surveillance is increasingly based on large-scale whole-genome sequencing databases. Real-time surveillance would benefit from the immediate assignment of each genome assembly to hierarchical population structures. Here we present pHierCC, a pipeline that defines a scalable clustering scheme, HierCC, based on core genome multi-locus sequence typing (cgMLST) that allows incremental, static, multi-level cluster assignments of genomes. We also present HCCeval, which identifies optimal thresholds for assigning genomes to cohesive HierCC clusters. HierCC was implemented in EnteroBase in 2018 and has since genotyped >530 000 genomes from Salmonella, Escherichia/Shigella, Streptococcus, Clostridioides, Vibrio and Yersinia.
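To make the idea of multi-level cluster assignment concrete, the following is a minimal sketch, not the pHierCC implementation. It assumes that genomes are represented as cgMLST allelic profiles, that distance is the number of differing loci, and that clusters at each level are formed by single-linkage at a given allelic-distance threshold. The function names (`allelic_distance`, `single_linkage`, `multi_level`), the handling of missing calls, and the toy profiles are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def allelic_distance(a, b):
    """Count loci where both profiles have a called allele and the alleles differ.

    Loci with a missing call (None) in either profile are skipped; this
    treatment of missing data is an assumption of the sketch.
    """
    return sum(1 for x, y in zip(a, b)
               if x is not None and y is not None and x != y)


def single_linkage(profiles, threshold):
    """Assign one cluster label per genome at a single threshold.

    Genomes connected by a chain of pairs with distance <= threshold share
    a label, which is the lowest genome index in that cluster.
    """
    parent = list(range(len(profiles)))

    def find(i):
        # Follow parent pointers to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(profiles)), 2):
        if allelic_distance(profiles[i], profiles[j]) <= threshold:
            ri, rj = find(i), find(j)
            if ri != rj:
                # Keep the smaller index as root so labels stay stable.
                parent[max(ri, rj)] = min(ri, rj)
    return [find(i) for i in range(len(profiles))]


def multi_level(profiles, thresholds):
    """Return {threshold: labels}, one clustering per hierarchical level."""
    return {t: single_linkage(profiles, t) for t in thresholds}


# Toy allelic profiles over five loci (invented data, illustration only).
profiles = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 2],
    [1, 1, 1, 3, 2],
    [4, 4, 4, 4, 4],
]
levels = multi_level(profiles, [0, 1, 2])
for threshold, labels in levels.items():
    print(threshold, labels)
```

In this toy example the first three genomes merge into one cluster once the threshold reaches one allelic difference, while the fourth genome stays separate at every level shown. The sketch recomputes all pairwise distances on each call, so it does not illustrate the incremental assignment that the abstract describes.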

Citation (APA)

Zhou, Z., Charlesworth, J., & Achtman, M. (2021). HierCC: a multi-level clustering scheme for population assignments based on core genome MLST. Bioinformatics, 37(20), 3645–3646. https://doi.org/10.1093/bioinformatics/btab234
