SFA-SPA: A suffix array based short peptide assembler for metagenomic data

Abstract

Summary: Determining protein sequences from a metagenomic dataset enables the study of the metabolism and functional roles of the organisms present in the sampled microbial community. We previously introduced an algorithm and software for the accurate reconstruction of protein sequences from short peptides identified on nucleotide reads in a metagenomic dataset. Here, we present significant computational improvements to the short peptide assembly algorithm that make it practical to reconstruct proteins from large metagenomic datasets containing several hundred million reads, while maintaining accuracy. The improved computational efficiency is achieved using a suffix array data structure that allows for fast querying during the assembly process, and a significant redesign of the assembly steps that enables multi-threaded execution. Availability and implementation: The program is available under the GPLv3 license from sourceforge.net/projects/spa-assembler.
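
To illustrate why a suffix array allows fast querying over a collection of short peptides, the following is a minimal Python sketch. It is not the SFA-SPA implementation: the function names (`build_suffix_array`, `find_occurrences`), the `$` separator, and the toy peptides are all assumptions made for illustration, and the naive sort-based construction shown here would not scale to hundreds of millions of reads.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of `text` in sorted order.

    Naive construction (sorts full suffixes); for illustration only.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return the sorted start positions where `pattern` occurs in `text`.

    Suffixes that begin with `pattern` form one contiguous block in the
    suffix array, so two binary searches locate the block boundaries.
    """
    m = len(pattern)

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])


# Hypothetical toy peptides, joined with '$' (assumed absent from queries)
# so that a match cannot span two peptides.
peptides = ["MKVLA", "VLAGT", "KVLAG"]
text = "$".join(peptides) + "$"
sa = build_suffix_array(text)

print(find_occurrences(text, sa, "VLA"))  # [2, 6, 13]
```

Each query costs a logarithmic number of string comparisons in the total text length, rather than a scan over every peptide, which is the general property that makes suffix arrays attractive for repeated overlap lookups during assembly.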

Cite

Yang, Y., Zhong, C., & Yooseph, S. (2015). SFA-SPA: A suffix array based short peptide assembler for metagenomic data. In Bioinformatics (Vol. 31, pp. 1833–1835). Oxford University Press. https://doi.org/10.1093/bioinformatics/btv052
