SFA-SPA: A suffix array based short peptide assembler for metagenomic data

Abstract

Summary: The determination of protein sequences from a metagenomic dataset enables the study of the metabolism and functional roles of the organisms present in the sampled microbial community. We previously introduced an algorithm and software for the accurate reconstruction of protein sequences from short peptides identified on nucleotide reads in a metagenomic dataset. Here, we present significant computational improvements to the short peptide assembly algorithm that make it practical to reconstruct proteins from large metagenomic datasets containing several hundred million reads, while maintaining accuracy. The improved computational efficiency is achieved using a suffix array data structure that allows for fast querying during the assembly process, and a significant redesign of the assembly steps that enables multi-threaded execution.

Availability and implementation: The program is available under the GPLv3 license from sourceforge.net/projects/spa-assembler.
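To illustrate why a suffix array allows fast querying over a set of short peptides, the following is a minimal sketch and not the SFA-SPA implementation. It assumes a naive sort-based construction, which is adequate only for toy inputs, and the function names `build_suffix_array` and `find_occurrences` are hypothetical. Each suffix is stored as a (peptide index, offset) pair, and a query k-mer is located by binary search over the sorted suffixes.

```python
from typing import List, Tuple

Entry = Tuple[int, int]  # (peptide index, offset within that peptide)


def build_suffix_array(peptides: List[str]) -> List[Entry]:
    """Return every (peptide, offset) pair, sorted by the suffix it denotes.

    Naive construction for illustration only: sorting compares whole
    suffixes, so it does not scale to real metagenomic datasets.
    """
    entries = [(i, j) for i, p in enumerate(peptides) for j in range(len(p))]
    entries.sort(key=lambda e: peptides[e[0]][e[1]:])
    return entries


def find_occurrences(peptides: List[str], sa: List[Entry], query: str) -> List[Entry]:
    """Return all (peptide, offset) positions where `query` occurs."""
    k = len(query)

    def prefix(pos: int) -> str:
        # First k characters of the suffix stored at rank `pos`.
        i, j = sa[pos]
        return peptides[i][j:j + k]

    # Lower bound: first rank whose k-prefix is >= query.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < query:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first rank whose k-prefix is > query.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= query:
            lo = mid + 1
        else:
            hi = mid

    # All suffixes sharing the query as a prefix are contiguous in the array.
    return sorted(sa[start:lo])


if __name__ == "__main__":
    reads = ["MKVLA", "KVLAG", "VLAGT"]  # toy peptides, not real data
    sa = build_suffix_array(reads)
    print(find_occurrences(reads, sa, "VLA"))
```

Because all suffixes that start with the query sit in one contiguous block of the sorted array, each lookup needs only a logarithmic number of string comparisons, which is the property that makes repeated overlap queries cheap during assembly.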

Yang, Y., Zhong, C., & Yooseph, S. (2015). SFA-SPA: A suffix array based short peptide assembler for metagenomic data. In Bioinformatics (Vol. 31, pp. 1833–1835). Oxford University Press. https://doi.org/10.1093/bioinformatics/btv052
