Evaluating the expected frequency of occurrences of a given set of patterns in a DNA sequence has numerous applications and has been studied extensively in recent years. We provide a unified framework for this evaluation that adapts to various constraints and allows previous results to be extended. We consider in turn the case where the patterns may overlap and the case where they may not. In a Markovian model, we derive exact formulae for the moments, which are linear functions of the size of the sequence. We show that our formulae, which occasionally simplify previous results, are computable at low cost, which makes them useful for practical applications. © 2000 Elsevier Science B.V.
Régnier, M. (2000). A unified approach to word occurrence probabilities. Discrete Applied Mathematics, 104(1–3), 259–280. https://doi.org/10.1016/S0166-218X(00)00195-5
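
The linear dependence of the expectation on the sequence size, mentioned in the abstract, can be illustrated with a minimal sketch. It is not taken from the paper: it assumes a stationary first-order Markov chain over the alphabet ACGT, counts overlapping occurrences only, and uses the standard linearity-of-expectation argument. The function names (`stationary`, `word_probability`, `expected_occurrences`) and the transition matrix `P` are hypothetical and chosen for illustration.

```python
ALPHABET = "ACGT"  # assumed alphabet; row/column order of the transition matrix


def stationary(P, iters=1000):
    """Approximate the stationary distribution of P by power iteration.

    Assumes P is a row-stochastic matrix of an ergodic chain.
    """
    k = len(P)
    pi = [1.0 / k] * k
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(k)) for j in range(k)]
    return pi


def word_probability(word, P, pi):
    """Probability that `word` starts at a fixed position of a stationary chain."""
    idx = [ALPHABET.index(c) for c in word]
    prob = pi[idx[0]]
    for a, b in zip(idx, idx[1:]):
        prob *= P[a][b]
    return prob


def expected_occurrences(words, n, P):
    """Expected total count of (possibly overlapping) occurrences of `words`
    in a sequence of length n.

    Each word of length m has n - m + 1 possible start positions, so the
    expectation is linear in n.
    """
    pi = stationary(P)
    return sum(
        max(n - len(w) + 1, 0) * word_probability(w, P, pi) for w in words
    )


if __name__ == "__main__":
    # Hypothetical transition matrix (rows: current letter, columns: next letter).
    P = [
        [0.4, 0.3, 0.2, 0.1],
        [0.1, 0.4, 0.3, 0.2],
        [0.2, 0.1, 0.4, 0.3],
        [0.3, 0.2, 0.1, 0.4],
    ]
    print(expected_occurrences(["ACG", "TTA"], 1000, P))
```

This sketch covers only the first moment in the overlapping case. The non-overlapping counts and the higher moments discussed in the abstract depend on the self-overlap structure of the patterns and are not reproduced here.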