PaSiT: A novel approach based on short-oligonucleotide frequencies for efficient bacterial identification and typing

This article is free to access.

Abstract

Motivation: One of the most widespread methods used in taxonomy studies to distinguish between strains or taxa is the calculation of average nucleotide identity. It requires a computationally expensive alignment step and is therefore not suitable for large-scale comparisons. Short oligonucleotide-based methods offer a faster alternative, but at the expense of accuracy. Here, we aim to address this shortcoming by providing software that implements a novel method based on short-oligonucleotide frequencies to compute inter-genomic distances.

Results: Our tetranucleotide and hexanucleotide implementations, which were optimized based on a taxonomically well-defined set of over 200 newly sequenced bacterial genomes, are as accurate as the short oligonucleotide-based method TETRA and average nucleotide identity for identifying bacterial species and strains, respectively. Moreover, the lightweight nature of this method makes it applicable to large-scale analyses.
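
To illustrate the general idea of alignment-free comparison from short-oligonucleotide frequencies, the following is a minimal Python sketch. It is not the PaSiT algorithm: the abstract does not specify the distance formula, so a plain Euclidean distance between normalized k-mer frequency vectors is used here as a stand-in. The function names `kmer_frequencies` and `euclidean_distance` are hypothetical and chosen only for this example.

```python
from collections import Counter
from itertools import product
import math


def kmer_frequencies(seq, k=4):
    """Return the relative frequency of every possible k-mer over ACGT.

    Windows containing any other character (e.g. N) are skipped.
    The vector is ordered lexicographically (AAAA, AAAC, ...).
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    total = sum(counts.values())
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    if total == 0:
        return [0.0] * len(all_kmers)
    return [counts[km] / total for km in all_kmers]


def euclidean_distance(a, b):
    """Illustrative stand-in distance; NOT the distance used by PaSiT."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


if __name__ == "__main__":
    # Toy sequences for demonstration only; real inputs would be whole genomes.
    genome_a = "ACGTACGTTTGACCA"
    genome_b = "ACGTTCGATTGACGA"
    fa = kmer_frequencies(genome_a, k=4)
    fb = kmer_frequencies(genome_b, k=4)
    print(euclidean_distance(fa, fb))
```

Because each genome is reduced to a fixed-length vector (256 entries for tetranucleotides, 4096 for hexanucleotides), comparing two genomes costs the same regardless of genome size once the vectors are computed, which is what makes this family of methods attractive for large-scale comparisons.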

Cite

Goussarov, G., Cleenwerck, I., Mysara, M., Leys, N., Monsieurs, P., … Van Houdt, R. (2020). PaSiT: A novel approach based on short-oligonucleotide frequencies for efficient bacterial identification and typing. Bioinformatics, 36(8), 2337–2344. https://doi.org/10.1093/bioinformatics/btz964
