Significance Score of Motifs in Biological Sequences

  • Nuel G

Abstract

In bioinformatics, it is common to search biological sequences (DNA, RNA, proteins) for functional motifs such as cross-over hotspot instigators (chi), restriction sites, regulation motifs, binding sites, active sites in proteins, etc. (Beaudoing et al., 2000; Brazma et al., 1998; El Karoui et al., 1999; Frith et al., 2002; Hampson et al., 2002; Karlin et al., 1992; Marino-Ramirez & Landsman, 2004; van Helden et al., 1998). Owing to evolutionary pressure, functional motifs are likely to be more conserved than non-functional motifs. A natural strategy is therefore to search biological sequences for motifs that are statistically exceptional (e.g., over- or under-represented). Given a motif of interest M (from simple strings to complex regular expressions), a recurrent question is: “How surprising is it to observe n occurrences of M in my dataset?” In statistical terms, this amounts to computing the p-value of the observation n with respect to a relevant reference model. More precisely, if X_{1:l} = X_1 ... X_l is a random sequence of length l generated by our reference model, and if N denotes the random number of occurrences of M in X_{1:l}, then for any n ≥ 0 our objective is to compute the significance score of the observation n.
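As an illustration only (this is not the chapter's own definition of the score), the tail probability P(N ≥ n) can be computed exactly in the simplest setting: M is a single word, occurrences may overlap, and the reference model draws letters independently. The function names, the independence assumption, and the uniform letter frequencies below are all assumptions made for this sketch; richer reference models such as Markov chains, and motifs given as regular expressions, would need more machinery.

```python
def build_automaton(word, alphabet):
    """Word-matching automaton (hypothetical helper).

    State k means: the longest prefix of `word` that is a suffix of the
    text read so far has length k. Reaching state len(word) marks one
    (possibly overlapping) occurrence.
    """
    m = len(word)
    delta = []
    for k in range(m + 1):
        row = {}
        for a in alphabet:
            t = word[:k] + a
            # Longest prefix of `word` (length <= m) that is a suffix of t.
            j = min(len(t), m)
            while not t.endswith(word[:j]):
                j -= 1
            row[a] = j
        delta.append(row)
    return delta


def tail_probability(word, length, n, probs):
    """P(N >= n) for overlapping occurrences of `word` in a sequence of
    `length` independent letters with frequencies `probs` (assumed model)."""
    if n <= 0:
        return 1.0
    m = len(word)
    delta = build_automaton(word, list(probs))
    # cur[k][c] = P(automaton in state k with c < n occurrences so far)
    cur = [[0.0] * n for _ in range(m + 1)]
    cur[0][0] = 1.0
    reached = 0.0  # probability mass that has already hit n occurrences
    for _ in range(length):
        nxt = [[0.0] * n for _ in range(m + 1)]
        for k in range(m + 1):
            for c in range(n):
                p = cur[k][c]
                if p == 0.0:
                    continue
                for a, pa in probs.items():
                    j = delta[k][a]
                    c2 = c + (1 if j == m else 0)
                    if c2 >= n:
                        reached += p * pa
                    else:
                        nxt[j][c2] += p * pa
        cur = nxt
    return reached


uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
print(tail_probability("AA", 3, 1, uniform))  # 7/64 = 0.109375
```

In the toy call, the sequences of length 3 containing "AA" at least once are AAA, AAx and xAA with x ≠ A, which gives 7 of the 64 equally likely sequences. The cost grows as length × (|word| + 1) × n × alphabet size, so this direct dynamic program is only practical for moderate n.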

Cite

Nuel, G. (2011). Significance Score of Motifs in Biological Sequences. In Bioinformatics - Trends and Methodologies. InTech. https://doi.org/10.5772/18448
