Abstract
Some strings (the texts) are assumed to be randomly generated according to a probability model that is either a Bernoulli model or a Markov model. A rare event is the over- or under-representation of a word or a set of words. The aim of this paper is twofold. First, a single word is given. We study the tail distribution of the number of its occurrences. Sharp large deviation estimates are derived. Second, we assume that a given word is overrepresented. The conditional distribution of a second word is studied; formulae for the expectation and the variance are derived. In both cases, the formulae are precise and can be computed efficiently. These results have applications in computational biology, where a genome is viewed as a text. © 2004 Discrete Mathematics and Theoretical Computer Science (DMTCS).
Régnier, M., & Denise, A. (2004). Rare events and conditional events on random strings. Discrete Mathematics and Theoretical Computer Science, 6(2), 191–214. https://doi.org/10.46298/dmtcs.310