Subsystems-based servers for rapid annotation of genomes and metagenomes
ORAL PRESENTATION Open Access
Ramy Karam Aziz1,2
From UT-ORNL-KBRIN Bioinformatics Summit 2010
Cadiz, KY, USA. 19-21 March 2010
Background
Today, more than 1000 genomes of cellular organisms, mostly
microbes, have been completely sequenced and deposited in
public databases, in addition to over 2000 viral genomes, and
these numbers are expected to skyrocket in the near future.
While sequencing projects remain largely biased towards
genomes linked to human interests [1] (e.g., domestic animals
and plants, microbial pathogens, and microbes exploited in
industry and agriculture), serious initiatives are being
launched to sequence organisms that represent all branches of
the tree of life [2].
Concomitant with the genomic revolution, unprecedented
advances in sequencing technology have also led to the
emergence of metagenomics, which offers a novel approach for
studying (microscopic) life in different environments.
Metagenomics allows investigators to assess the biodiversity
in a given ecosystem by directly sequencing DNA sampled from
that ecosystem [3-5]. As so-called next-generation sequencing
technologies evolve, producing tremendous amounts of data [6],
the existing tools for sequence annotation are not fast enough
to keep pace. Consequently, manual annotation has become
almost impossible; automated annotation tools, however, often
lead to error propagation and biologically irrelevant
ontologies.
Materials and methods
Here, I demonstrate how the use of the subsystems [7] and
FIGfams [8,9] technologies, initiated by the Fellowship for
Interpretation of Genomes (FIG) and the University of Chicago
National Microbial Pathogen Data Resource (NMPDR) project [10],
has improved the accuracy and consistency of genome and
metagenome annotation [11]. Subsystems combine careful human
annotation with rapid computational propagation of the
assertions made by human experts. This propagation runs
through the RAST [8] pipeline for genome annotation, the
MG-RAST server for metagenome annotation [12], and Phage-RAST
for phage genome annotation (work in progress).
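The propagation idea can be illustrated with a minimal sketch. It assumes that each protein has already been placed in a protein family and that experts have curated one functional role per family. The family identifiers, role names, and the `propagate_roles` helper are hypothetical and are not part of RAST, MG-RAST, or the FIGfams release.

```python
# Hypothetical toy data: one expert-curated functional role per protein family.
FAMILY_ROLE = {
    "FAM0001": "DNA polymerase III alpha subunit",
    "FAM0002": "Glucokinase",
}


def propagate_roles(family_of, family_role, default="hypothetical protein"):
    """Give every protein the curated role of its family.

    family_of maps a protein id to a family id (or None if unplaced).
    Proteins without a curated family receive the default label.
    """
    return {
        protein: family_role.get(family, default)
        for protein, family in family_of.items()
    }


# Usage: three genes, one of which could not be placed in any family.
annotations = propagate_roles(
    {"gene_1": "FAM0001", "gene_2": None, "gene_3": "FAM0002"},
    FAMILY_ROLE,
)
```

Because the role is stored once per family, a correction made by a curator reaches every member protein the next time the roles are propagated, which is the consistency benefit the text describes.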
Results and conclusion
Although these servers offer relatively rapid annotation, the
increasing throughput of sequencing platforms requires even
faster pipelines: annotating a large metagenomic data set can
still take weeks to months. To address this challenge,
researchers at San Diego State University, FIG, and the
Argonne National Laboratory are developing a protein family
signature-based technology (Robert A. Edwards, Ross Overbeek,
et al., submitted) to reduce annotation time by an order of
magnitude and create a real-time annotation server (URL:
http://edwards.sdsu.edu/rtmg). Such a server will not only
improve speed, but will also allow the implementation of
annotation pipelines on cell phones (Josh Hoffman et al.,
unpublished data) and social networks (Daniel Cuevas et al.,
unpublished data).
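The abstract does not say how the protein family signatures are defined. As a hedged illustration only, the sketch below assumes a signature is a short amino-acid k-mer that occurs in exactly one family, so that a query can be classified by dictionary lookups rather than by alignment. The function names, the value of `k`, and the toy sequences are all assumptions made for this example and do not describe the server itself.

```python
from collections import Counter


def build_signatures(families, k=8):
    """Map each k-mer found in exactly one family to that family."""
    owner = {}
    for family, sequences in families.items():
        for seq in sequences:
            for i in range(len(seq) - k + 1):
                kmer = seq[i:i + k]
                if owner.get(kmer, family) != family:
                    owner[kmer] = None  # seen in another family: not a signature
                else:
                    owner[kmer] = family
    return {kmer: fam for kmer, fam in owner.items() if fam is not None}


def classify(protein, signatures, k=8):
    """Return the family with the most signature hits, or None if there are none."""
    hits = Counter(
        signatures[protein[i:i + k]]
        for i in range(len(protein) - k + 1)
        if protein[i:i + k] in signatures
    )
    return hits.most_common(1)[0][0] if hits else None


# Usage with made-up sequences and a small k so the toy data produces hits.
toy_families = {"A": ["MKTAYIAK"], "B": ["MKTLLVAG"]}
toy_signatures = build_signatures(toy_families, k=3)
```

Each query residue costs one hash lookup in this scheme, which is one plausible reason a signature-based method could be much faster than similarity searches; whether the real pipeline works this way is not stated in the source.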
Acknowledgements
I thank Dr. Robert A. Edwards for sharing details about work in progress
performed in his laboratory (URL: http://edwards.sdsu.edu/labsite) at San
Diego State University, San Diego, CA, USA.
Author details
1Department of Microbiology and Immunology, Faculty of Pharmacy, Cairo
University, Cairo, Egypt. 2San Diego State University, San Diego, CA 92182,
USA.
Correspondence: ramy.aziz@salmonella.org
Published: 23 July 2010
Aziz BMC Bioinformatics 2010, 11(Suppl 4):O2
http://www.biomedcentral.com/1471-2105/11/S4/O2
© 2010 Aziz; licensee BioMed Central Ltd.
References
1. Aziz RK: The case for biocentric microbiology. Gut Pathog 2009, 1:16.
2. Wu D, Hugenholtz P, Mavromatis K, Pukall R, Dalin E, Ivanova NN, Kunin V,
Goodwin L, Wu M, Tindall BJ, et al: A phylogeny-driven genomic
encyclopaedia of Bacteria and Archaea. Nature 2009, 462:1056-1060.
3. Riesenfeld CS, Schloss PD, Handelsman J: Metagenomics: genomic analysis
of microbial communities. Annu Rev Genet 2004, 38:525-552.
4. Handelsman J: Metagenomics: application of genomics to uncultured
microorganisms. Microbiol Mol Biol Rev 2004, 68:669-685.
5. Edwards RA, Rohwer F: Viral metagenomics. Nat Rev Microbiol 2005,
3:504-510.
6. Schuster SC: Next-generation sequencing transforms today’s biology. Nat
Methods 2008, 5:16-18.
7. Overbeek R, Begley T, Butler RM, Choudhuri JV, Chuang HY, Cohoon M, de
Crecy-Lagard V, Diaz N, Disz T, Edwards R, et al: The subsystems approach
to genome annotation and its use in the project to annotate 1000
genomes. Nucleic Acids Res 2005, 33:5691-5702.
8. Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, Formsma K,
Gerdes S, Glass EM, Kubal M, et al: The RAST Server: rapid annotations
using subsystems technology. BMC Genomics 2008, 9:75.
9. Meyer F, Overbeek R, Rodriguez A: FIGfams: yet another set of protein
families. Nucleic Acids Res 2009, 37:6643-6654.
10. McNeil LK, Reich C, Aziz RK, Bartels D, Cohoon M, Disz T, Edwards RA,
Gerdes S, Hwang K, Kubal M, et al: The National Microbial Pathogen
Database Resource (NMPDR): a genomics platform based on subsystem
annotation. Nucleic Acids Res 2007, 35:D347-353.
11. Overbeek R, Bartels D, Vonstein V, Meyer F: Annotation of bacterial and
archaeal genomes: improving accuracy and consistency. Chem Rev 2007,
107:3431-3447.
12. Meyer F, Paarmann D, D’Souza M, Olson R, Glass EM, Kubal M, Paczian T,
Rodriguez A, Stevens R, Wilke A, et al: The metagenomics RAST server - a
public resource for the automatic phylogenetic and functional analysis
of metagenomes. BMC Bioinformatics 2008, 9:386.
doi:10.1186/1471-2105-11-S4-O2
Cite this article as: Aziz: Subsystems-based servers for rapid annotation
of genomes and metagenomes. BMC Bioinformatics 2010 11(Suppl 4):O2.


