On the combinatorics of sparsification

5Citations
Citations of this article
8Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Background: We study the sparsification of dynamic programming based on folding algorithms of RNA structures. Sparsification is a method that improves significantly the computation of minimum free energy (mfe) RNA structures.Results: We provide a quantitative analysis of the sparsification of a particular decomposition rule, Λ*. This rule splits an interval of RNA secondary and pseudoknot structures of fixed topological genus. Key for quantifying sparsifications is the size of the so called candidate sets. Here we assume mfe-structures to be specifically distributed (see Assumption 1) within arbitrary and irreducible RNA secondary and pseudoknot structures of fixed topological genus. We then present a combinatorial framework which allows by means of probabilities of irreducible sub-structures to obtain the expectation of the Λ*-candidate set w.r.t. a uniformly random input sequence. We compute these expectations for arc-based energy models via energy-filtered generating functions (GF) in case of RNA secondary structures as well as RNA pseudoknot structures. Furthermore, for RNA secondary structures we also analyze a simplified loop-based energy model. Our combinatorial analysis is then compared to the expected number of Λ*-candidates obtained from the folding mfe-structures. In case of the mfe-folding of RNA secondary structures with a simplified loop-based energy model our results imply that sparsification provides a significant, constant improvement of 91% (theory) to be compared to an 96% (experimental, simplified arc-based model) reduction. However, we do not observe a linear factor improvement. Finally, in case of the " full" loop-energy model we can report a reduction of 98% (experiment).Conclusions: Sparsification was initially attributed a linear factor improvement. This conclusion was based on the so called polymer-zeta property, which stems from interpreting polymer chains as self-avoiding walks. Subsequent findings however reveal that the O(n) improvement is not correct. The combinatorial analysis presented here shows that, assuming a specific distribution (see Assumption 1), of mfe-structures within irreducible and arbitrary structures, the expected number of Λ*-candidates is Θ(n2). However, the constant reduction is quite significant, being in the range of 96%. We furthermore show an analogous result for the sparsification of the Λ*-decomposition rule for RNA pseudoknotted structures of genus one. Finally we observe that the effect of sparsification is sensitive to the employed energy model. © 2012 Huang and Reidys; licensee BioMed Central Ltd.

Cite

CITATION STYLE

APA

Huang, F. W. D., & Reidys, C. M. (2012). On the combinatorics of sparsification. Algorithms for Molecular Biology, 7(1). https://doi.org/10.1186/1748-7188-7-28

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free