Approximate runs - revisited

Gad M. Landau

Conference Proceedings, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2008) 5280 LNCS 2

DOI: 10.1007/978-3-540-89097-3_2

Abstract

The problem of finding repeats within a string is an important computational problem with applications in data compression and in the field of molecular biology. Both exact and inexact repeats occur frequently in the genome, and certain repeats are known to be related to human diseases. A multiple tandem repeat in a sequence S is a (periodic) substring r of S of the form r = u^a u′, where u (the period) is a prefix of r, u′ is a prefix of u and a ≥ 2. A run is a maximal (non-extendable) multiple tandem repeat. An approximate run is a run with errors (i.e. the repeated subsequences are similar but not identical). Many measures have been proposed that capture the similarity among all periods. We may measure the number of errors between consecutive periods, between all periods, or between each period and a consensus string. Another possible measure is the number of positions in the periods that may differ. In this talk I will survey a range of our results in this area. Various parts of this work are joint work with Maxime Crochemore, Gene Myers, Jeanette Schmidt and Dina Sokol.
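
To make the definitions above concrete, here is a minimal Python sketch. It is not one of the algorithms surveyed in the talk: `find_runs` is a naive quadratic enumeration of exact runs, and `consecutive_period_errors` illustrates only the "errors between consecutive periods" measure, under the assumption that an error is a mismatch (Hamming distance). Both function names are hypothetical.

```python
def find_runs(s):
    """Naively enumerate the exact runs of s.

    Returns sorted (start, end, period) triples with `end` exclusive.
    Each s[start:end] has the form u^a u' with a >= 2 and cannot be
    extended left or right while keeping the same period.
    """
    n = len(s)
    runs = {}  # (start, end) -> smallest period found for that interval
    for p in range(1, n // 2 + 1):
        i = 0
        while i + p < n:
            # Extend the stretch of positions k with s[k] == s[k + p].
            j = i
            while j + p < n and s[j] == s[j + p]:
                j += 1
            # s[i:j + p] has period p; keep it if it spans at least 2 periods.
            if j - i >= p:
                # p is visited in ascending order, so the first value
                # stored for an interval is its smallest period.
                runs.setdefault((i, j + p), p)
            i = j + 1
    return sorted((start, end, p) for (start, end), p in runs.items())


def consecutive_period_errors(r, p):
    """Count mismatches between each pair of consecutive periods of r.

    Assumption: 'errors' means Hamming distance. The final, possibly
    partial, period u' is compared only over its own length.
    """
    periods = [r[k:k + p] for k in range(0, len(r), p)]
    return [
        sum(a != b for a, b in zip(x, y))
        for x, y in zip(periods, periods[1:])
    ]


if __name__ == "__main__":
    print(find_runs("mississippi"))
    print(consecutive_period_errors("abcabdabc", 3))
```

In "mississippi" the substring "ississi" is a run with period u = "iss", exponent a = 2 and u′ = "i". The string "abcabdabc" is not an exact run, but with period length 3 each consecutive pair of periods differs in one position, so it would qualify as an approximate run under a threshold of one mismatch between consecutive periods.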

Author supplied keywords

Prefix

Cite

Landau, G. M. (2008). Approximate runs - revisited. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5280 LNCS, p. 2). Springer Verlag. https://doi.org/10.1007/978-3-540-89097-3_2
