Background: Technological advances in high-throughput genome sequencing are making whole genome sequencing (WGS) available as a routine tool for bacterial typing. Standardized procedures for identifying relevant genes and variation are needed to enable comparison between studies and over time. The core genes, that is, the genes conserved in all (or most) members of a genus or species, are potentially good candidates for investigating genomic variation in phylogeny and epidemiology.

Results: We identify a set of 2,882 core gene clusters based on 73 publicly available Salmonella enterica genomes and evaluate their value as typing targets, comparing whole-genome typing with traditional methods such as 16S and MLST. A consensus tree based on variation in core genes gives much better resolution than 16S and MLST; the pan-genome family tree is similar to the consensus tree, but with higher confidence. The core genes can be divided into two categories: a few highly variable genes and a larger set of conserved core genes with low variance. For the most variable core genes, the variance in amino acid sequences is higher than in the corresponding nucleotide sequences, suggesting positive selection for mutations that lead to amino acid changes.

Conclusions: Genomic variation within the core genome is useful for investigating molecular evolution and for providing candidate genes for bacterial genome typing. Identifying genes with different degrees of variation is especially important in trend analysis.

© 2012 Leekitcharoenphon et al; licensee BioMed Central Ltd.
Leekitcharoenphon, P., Lukjancenko, O., Friis, C., Aarestrup, F. M., & Ussery, D. W. (2012). Genomic variation in Salmonella enterica core genes for epidemiological typing. BMC Genomics, 13(1). https://doi.org/10.1186/1471-2164-13-88