An algorithm for score aggregation over causal biological networks based on random walk sampling

This article is free to access.

Abstract

Background: We recently published in BMC Systems Biology an approach for calculating the perturbation amplitudes of causal network models by integrating gene differential expression data. This approach relies on the process of score aggregation, which combines the perturbations at the level of the individual network nodes into a global measure that quantifies the perturbation of the network as a whole. Such "bottom-up" aggregation relates the changes in molecular entities measured by omics technologies to systems-level phenotypes. However, the aggregation method we used is limited to a specific class of causal network models called "causally consistent", which is equivalent to the notion of balance of a signed graph used in graph theory. As a consequence of this limitation, our aggregation method cannot be used in the many relevant cases involving "causally inconsistent" network models such as those containing negative feedbacks.

Findings: In this note, we propose an algorithm called "sampling of spanning trees" (SST) that extends our published aggregation method to causally inconsistent network models by replacing the signed relationships between the network nodes by an appropriate continuous measure. The SST algorithm is based on spanning trees, which are a particular class of subgraphs used in graph theory, and on a sampling procedure leveraging the properties of specific random walks on the graph. This algorithm is applied to several cases of biological interest.

Conclusions: The SST algorithm provides a practical means of aggregating nodal values over causally inconsistent network models based on solid mathematical foundations. We showed its utility in systems biology, where the nodal values can be perturbation amplitudes of protein activities or gene differential expressions, while the networks can be models of cellular signaling or expression regulation. Since the SST algorithm is based on general graph-theoretical considerations, it is scalable to arbitrary graph sizes and can potentially be used for performing quantitative analyses in any context involving signed graphs.
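
The abstract does not spell out the sampling procedure or the exact form of the continuous measure, so the following Python sketch is only one plausible reading, not the authors' implementation. It assumes (1) that spanning trees are drawn uniformly with Wilson's loop-erased random walk algorithm, and (2) that the continuous weight of a node is the mean, over sampled trees, of the product of edge signs along the tree path from that node to a reference node. The names `sample_spanning_tree`, `path_sign`, and `effective_weights`, as well as the toy edge list, are hypothetical.

```python
import random
from collections import defaultdict


def sample_spanning_tree(adj, root, rng):
    """Draw a uniform spanning tree with Wilson's algorithm (assumed sampler).

    Returns a dict mapping each node to its parent on the path to `root`.
    """
    parent = {root: None}
    for start in adj:
        # Random walk until the current tree is hit; overwriting nxt[node]
        # on every visit erases loops implicitly.
        node, nxt = start, {}
        while node not in parent:
            nxt[node] = rng.choice(adj[node])
            node = nxt[node]
        # Attach the loop-erased path to the tree.
        node = start
        while node not in parent:
            parent[node] = nxt[node]
            node = nxt[node]
    return parent


def path_sign(node, parent, sign):
    """Product of edge signs along the tree path from `node` to the root."""
    s = 1
    while parent[node] is not None:
        s *= sign[frozenset((node, parent[node]))]
        node = parent[node]
    return s


def effective_weights(edges, ref, n_samples=2000, seed=0):
    """Mean path sign to `ref` over sampled spanning trees (assumed measure)."""
    rng = random.Random(seed)
    adj, sign = defaultdict(list), {}
    for u, v, s in edges:
        adj[u].append(v)
        adj[v].append(u)
        sign[frozenset((u, v))] = s
    adj = dict(adj)
    totals = dict.fromkeys(adj, 0)
    for _ in range(n_samples):
        parent = sample_spanning_tree(adj, ref, rng)
        for n in adj:
            totals[n] += path_sign(n, parent, sign)
    return {n: totals[n] / n_samples for n in adj}


# Toy signed triangle (hypothetical signs): X -> Y (+), X -> Z (+), Y -| Z (-).
toy_edges = [("X", "Y", 1), ("X", "Z", 1), ("Y", "Z", -1)]
weights = effective_weights(toy_edges, ref="Z")
```

For this toy triangle, two of the three spanning trees connect X to Z with a positive path and one with a negative path, so under the stated assumptions the weight of X should approach 1/3 as the number of samples grows. On a balanced graph every tree gives the same path sign, so the weights reduce to the usual values of +1 or -1.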

Figures

  • Figure 1 Causally inconsistent biological networks, spanning trees, and results of the SST algorithm. (A) The incoherent feed-forward loop (IFFL) as an example of a causally inconsistent network, termed an “unbalanced graph” in graph theory. (B) The three spanning trees corresponding to the IFFL shown in (A). (C) Magnification of the neighborhood of the TXNIP feedback loop from the “Hypoxic Stress” network. The effective node weights S_{n→REF} from SST are indicated in the boxes, and the red X indicates the edge that is absent in the pruned causally consistent version of the network. (D) Receiver operating characteristic (ROC) curve (true positive rate vs. false positive rate) for the comparisons between the effective node weights S_{n→REF} from SST and the corresponding nodal signs s_{n→REF} for the 19 networks given in Additional file 1: Table S1. The color of the curve follows the prediction threshold applied on S_{n→REF} and shows that mislabeling occurs mainly for small values around zero (i.e. the green part of the curve). The area under the ROC curve (AUROC) is 0.992.
  • Figure 2 Evaluation of the SST algorithm at the level of the NPA scores. The Pearson correlation coefficients were calculated between the 16 pairs of GPI NPA scores corresponding to the causally inconsistent and pruned causally consistent network versions obtained for the 16 treatment vs. control comparisons contained in the TNF dataset. Only eight network models that were compatible with the tissue context of NHBE cells were considered. The low score correlation for the “Notch” network is consistent with the lack of significant scores for this network, while the poor score correlation for “Replicative Senescence” can be understood in light of the different effective nodal weights for nodes in a region of the network describing MAPK signaling.
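
The statements in Figure 1 (A) and (B), that the IFFL is an unbalanced signed graph with three spanning trees, can be checked on a small example. The Python sketch below is illustrative only: since the figure itself is not reproduced here, the edge signs of the IFFL (X activates Y, X activates Z, Y represses Z) are an assumption, and the helper names `is_balanced` and `spanning_trees` are hypothetical. Balance is tested by trying to assign a label of +1 or -1 to every node so that each edge sign equals the product of its endpoint labels; spanning trees are enumerated by brute force, which is only practical for very small graphs.

```python
from itertools import combinations


def is_balanced(nodes, edges):
    """True if the signed graph is balanced (causally consistent).

    edges: iterable of (u, v, sign) with sign in {+1, -1}.
    """
    adj = {n: [] for n in nodes}
    for u, v, s in edges:
        adj[u].append((v, s))
        adj[v].append((u, s))
    label = {}
    for start in nodes:
        if start in label:
            continue
        label[start] = 1
        stack = [start]
        while stack:
            u = stack.pop()
            for v, s in adj[u]:
                if v not in label:
                    label[v] = label[u] * s  # propagate the sign along the edge
                    stack.append(v)
                elif label[v] != label[u] * s:
                    return False  # a cycle with an odd number of negative edges
    return True


def spanning_trees(nodes, edges):
    """Enumerate spanning trees by testing every subset of n-1 edges."""
    trees = []
    for subset in combinations(edges, len(nodes) - 1):
        root = {n: n for n in nodes}  # simple union-find forest

        def find(x):
            while root[x] != x:
                x = root[x]
            return x

        acyclic = True
        for u, v, _ in subset:
            ru, rv = find(u), find(v)
            if ru == rv:
                acyclic = False
                break
            root[ru] = rv
        if acyclic:  # n-1 edges without a cycle form a spanning tree
            trees.append(subset)
    return trees


# Assumed IFFL signs: X -> Y (+), X -> Z (+), Y -| Z (-).
nodes = ["X", "Y", "Z"]
iffl = [("X", "Y", 1), ("X", "Z", 1), ("Y", "Z", -1)]
trees = spanning_trees(nodes, iffl)
```

Each spanning tree is itself balanced, because a tree contains no cycles. This is what makes trees a convenient building block for extending an aggregation method defined on causally consistent networks.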

Citation (APA)

Vasilyev, D. M., Thomson, T. M., Frushour, B. P., Martin, F., & Sewer, A. (2014). An algorithm for score aggregation over causal biological networks based on random walk sampling. BMC Research Notes, 7(1). https://doi.org/10.1186/1756-0500-7-516
