Mathematical modeling of avidity distribution and estimating general binding properties of transcription factors from genome-wide binding profiles

Vladimir A. Kuznetsov

Book Chapter

Humana Press Inc., (2017), 193-276

DOI: 10.1007/978-1-4939-7027-8_9

Abstract

The experimental frequency distributions (EFDs) of diverse molecular interaction events that quantify genome-wide binding are often skewed toward rare but abundant quantities. Such distributions deviate systematically from the standard power-law functions proposed by scale-free network models, suggesting that more explanatory and predictive probabilistic models are needed. The goal of this analytical survey is to identify mechanism-based, data-driven statistical distributions that estimate and predict the binding properties of transcription factors from genome-wide binding profiles. Here, we review and develop an analytical framework for modeling, analysis, and prediction of transcription factor (TF)–DNA binding properties detected at the genome scale. We introduce a probabilistic mixture model of the binding avidity function that includes nonspecific and specific binding events. We propose a method for decomposing specific and nonspecific TF–DNA binding events. We show that the Kolmogorov–Waring (KW) probability function (PF), which models the steady-state stochastic process of TF binding and dissociation, fits the EFDs of diverse TF–DNA binding datasets well. Furthermore, this distribution predicts the total number of TF–DNA binding sites (BSs) and estimates specificity, sensitivity, and other basic statistical features of TF–DNA binding when the experimental datasets are noise-rich and essentially incomplete. The KW distribution fits TF–DNA binding activity equally well for different TFs, including ERE, CREB, STAT1, Nanog, and Oct4. Our analysis reveals that the KW distribution and its generalized form provide a family of power-law-like distributions expressed in terms of hypergeometric series functions, including the standard and generalized Pareto and Waring distributions, and thus offer flexible and common skewed forms of the transcription factor binding site (TFBS) avidity distribution function.
We suggest that the skewed distribution of binding events may arise from a wide range of evolutionary processes that create weak-avidity TFBSs through random mutations, whereas high-avidity binding sites (i.e., evolutionarily conserved canonical E-boxes) occur rarely. These rare sites, however, may be positively selected in microevolution.
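
The abstract names the Waring distribution as one member of the power-law-like family but does not give a formula. As a rough illustration only, the sketch below evaluates the textbook two-parameter Waring probability function, P(X = x) = ρ Γ(α+ρ) Γ(x+α) / (Γ(α) Γ(x+α+ρ+1)) for x = 0, 1, 2, ..., whose tail decays approximately as x^-(ρ+1). The parameter names `alpha` and `rho` and the function names are assumptions made here, and this is not the chapter's Kolmogorov–Waring parameterization, which is not stated in this abstract.

```python
import math


def waring_pmf(x: int, alpha: float, rho: float) -> float:
    """Textbook Waring probability P(X = x) for x = 0, 1, 2, ...

    Assumed parameterization (not taken from the chapter):
        P(X = x) = rho * G(alpha + rho) * G(x + alpha)
                   / (G(alpha) * G(x + alpha + rho + 1)),
    where G is the gamma function, alpha > 0 and rho > 0.
    """
    if alpha <= 0 or rho <= 0:
        raise ValueError("alpha and rho must be positive")
    if x < 0:
        return 0.0
    # Work in log space so large x does not overflow the gamma function.
    log_p = (
        math.log(rho)
        + math.lgamma(alpha + rho)
        + math.lgamma(x + alpha)
        - math.lgamma(alpha)
        - math.lgamma(x + alpha + rho + 1)
    )
    return math.exp(log_p)


def waring_tail(x: int, alpha: float, rho: float) -> float:
    """Upper tail P(X >= x) = B(x + alpha, rho) / B(alpha, rho)."""
    if alpha <= 0 or rho <= 0:
        raise ValueError("alpha and rho must be positive")
    if x <= 0:
        return 1.0
    log_t = (
        math.lgamma(x + alpha)
        + math.lgamma(alpha + rho)
        - math.lgamma(x + alpha + rho)
        - math.lgamma(alpha)
    )
    return math.exp(log_t)


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    alpha, rho = 1.0, 2.0
    for k in (0, 1, 2, 10, 100):
        print(k, waring_pmf(k, alpha, rho), waring_tail(k, alpha, rho))
```

With `alpha = 1` this reduces to a Yule-type distribution on the non-negative integers. The closed-form tail is useful in the incomplete-data setting the abstract describes, because it gives the probability mass beyond any observed cutoff without summing the series.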

Citation (APA)

Kuznetsov, V. A. (2017). Mathematical modeling of avidity distribution and estimating general binding properties of transcription factors from genome-wide binding profiles. In Methods in Molecular Biology (Vol. 1613, pp. 193–276). Humana Press Inc. https://doi.org/10.1007/978-1-4939-7027-8_9
