Fast algorithms for inferring gene-species associations

8Citations
Citations of this article
4Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Assessment of microbial biodiversity is typically made by sequencing either PCR-amplified marker genes or all genomic DNA from environmental samples. Both approaches rely on the similarity of the sequenced material to known entries in sequence databases. However, amplicons of non-marker genes are often used, when the research question aims at assessing both functional capabilities of a microbial community and its biodiversity. In such cases, a phylogenetic tree is constructed with known and metagenomic sequences, and expert assessment defines the taxonomic groups the amplicons belong to. Here, instead of relying on sequences, often missing, of non-marker genes, we use tree reconciliation to obtain a distribution of mappings between genes and species. We describe efficient algorithms for the reconstruction of gene-species mappings and a Monte-Carlo method for the inference of distributions for the cases when the number of optimal reconstructions is large. We provide a comparative study of different cost functions showing that the duplication-loss cost induces mappings of the highest quality. Further, we demonstrate the correctness of our approach using several datasets.

Cite

CITATION STYLE

APA

Betkier, A., Szczęsny, P., & Gorecki, P. (2015). Fast algorithms for inferring gene-species associations. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9096, pp. 36–47). Springer Verlag. https://doi.org/10.1007/978-3-319-19048-8_4

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free