MOTIVATION: Next-generation sequencing methods are generating increasingly massive datasets, yet still do not fully capture genetic diversity in the richest environments. To understand such complicated and elusive systems, effective tools are needed to assist with delineating the differences found in and between community datasets.

RESULTS: The Small Subunit Markov Modeler (SSuMMo) was developed to probabilistically assign SSU rRNA gene fragments from any sequence dataset to recognised taxonomic clades, producing consistent, comparable cladograms. Accuracy tests predicted over 90% of genera correctly for sequences downloaded from public reference databases. Sequences from a next-generation sequence dataset, sampled from lean, overweight and obese individuals, were analysed to demonstrate parallel visualisation of comparable datasets. SSuMMo shows potential as a valuable curatorial tool, as numerous incorrect and outdated taxonomic entries and annotations were identified in public databases.

AVAILABILITY AND IMPLEMENTATION: SSuMMo is GPLv3 open source Python software, available at http://code.google.com/p/ssummo/. Taxonomy and HMM databases can be downloaded from http://bioltfws1.york.ac.uk/ssummo/.

CONTACT: email@example.com

SUPPLEMENTARY INFORMATION: Supplemental materials are available at Bioinformatics Online.