Discriminating membrane proteins using the joint distribution of length sums of success and failure runs

Abstract

Discriminating integral membrane proteins from water-soluble ones has been an important goal of computational molecular biology over the past decades. A major drawback of the methods that have appeared in the literature is that most authors approached the problem with machine learning techniques. Specifically, most of the proposed methods require an appropriate dataset for training, and consequently the results depend heavily on the suitability of the dataset itself. Motivated by these facts, in this paper we develop a formal discrimination procedure based on appropriate theoretical observations on the sequence of hydrophobic and polar residues along the protein sequence and on the exact distribution of a two-dimensional runs-related statistic defined on the same sequence. Specifically, to set up our discrimination procedure, we thoroughly study the exact distribution of a bivariate random variable that accumulates the exact lengths of both success and failure runs of at least a specific length in a sequence of Bernoulli trials. To investigate the properties of this bivariate random variable, we use the Markov chain embedding technique. Finally, we apply the new procedure to a well-defined dataset of proteins.
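To make the bivariate statistic described in the abstract concrete, here is a minimal Python sketch, not the authors' implementation. It assumes a binary sequence (1 = success, 0 = failure), independent trials with success probability p, and separate length thresholds for the two run types. The function names `run_length_sums` and `joint_distribution` and the parameters `k_success` and `k_failure` are hypothetical. The second function obtains the exact joint distribution by propagating probabilities over an augmented state, in the spirit of Markov chain embedding.

```python
from collections import defaultdict
from itertools import groupby


def run_length_sums(seq, k_success, k_failure):
    """Return (S, F): summed lengths of success runs of length >= k_success
    and of failure runs of length >= k_failure in a 0/1 sequence."""
    s_total, f_total = 0, 0
    for symbol, group in groupby(seq):
        length = sum(1 for _ in group)
        if symbol == 1 and length >= k_success:
            s_total += length
        elif symbol == 0 and length >= k_failure:
            f_total += length
    return s_total, f_total


def joint_distribution(n, p, k_success, k_failure):
    """Exact joint pmf of (S, F) for n i.i.d. Bernoulli(p) trials.

    Assumes both thresholds are >= 1. The state is
    (last symbol, current run length capped at its threshold, S, F).
    """
    states = {(None, 0, 0, 0): 1.0}
    for _ in range(n):
        nxt = defaultdict(float)
        for (sym, run, s, f), prob in states.items():
            for new, q in ((1, p), (0, 1.0 - p)):
                k = k_success if new == 1 else k_failure
                prev = run if new == sym else 0
                if prev >= k:
                    # Run already counted: each extra trial adds 1.
                    new_run, add = k, 1
                else:
                    # Run counts in full once it reaches the threshold.
                    new_run = prev + 1
                    add = k if new_run == k else 0
                if new == 1:
                    nxt[(new, new_run, s + add, f)] += prob * q
                else:
                    nxt[(new, new_run, s, f + add)] += prob * q
        states = nxt
    # Marginalize out the bookkeeping part of the state.
    pmf = defaultdict(float)
    for (_, _, s, f), prob in states.items():
        pmf[(s, f)] += prob
    return dict(pmf)
```

For instance, `run_length_sums([1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1], 3, 3)` gives `(3, 4)`: only the success run of length 3 and the failure run of length 4 reach the threshold. In a protein setting one might code hydrophobic residues as 1 and polar residues as 0, but that encoding, like the thresholds, is an assumption of this sketch.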

Cite

Bersimis, S., Sachlas, A., & Bagos, P. G. (2017). Discriminating membrane proteins using the joint distribution of length sums of success and failure runs. Statistical Methods and Applications, 26(2), 251–272. https://doi.org/10.1007/s10260-016-0370-y
