A mathematical method was developed in this study to detect tandem repeats in a DNA sequence. A multiple alignment of periods was calculated by direct optimization of the position-weight matrix (PWM), without using pairwise alignments or searching for similarity between periods. Random PWMs were used to develop a new mathematical algorithm for periodicity search. The developed algorithm was applied to the DNA sequences of the C. elegans genome. A total of 25,360 regions with a periodicity of 2 to 50 bases in length were found. On average, a periodicity spanning ~4000 nucleotides was associated with each region. A significant portion of the revealed regions have periods of 10 and 11 nucleotides, periods that are multiples of 10 nucleotides, and periods in the vicinity of 35 nucleotides. Only ~30% of the periods found had been reported previously. The origin of periodicity with insertions and deletions is discussed.
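To make the idea of a position-specific matrix for a period concrete, the following is a minimal Python sketch. It is not the authors' algorithm: it uses no random PWMs, performs no optimization, and does not handle insertions or deletions. It only counts bases at each position modulo a candidate period length and scores the result by the mutual information between phase and base. The names `period_count_matrix` and `periodicity_score` are hypothetical and chosen for illustration.

```python
import math
from collections import Counter

BASES = "ACGT"

def period_count_matrix(seq, period):
    """Count how often each base occurs at each position modulo `period`.

    Illustrative only: assumes a perfectly phased repeat (no indels).
    """
    counts = [Counter() for _ in range(period)]
    for i, base in enumerate(seq.upper()):
        if base in BASES:  # skip ambiguous symbols such as N
            counts[i % period][base] += 1
    return counts

def periodicity_score(seq, period):
    """Mutual information (bits) between phase (i mod period) and base."""
    counts = period_count_matrix(seq, period)
    total = sum(sum(c.values()) for c in counts)
    if total == 0:
        return 0.0
    # Overall base composition across all phases.
    base_totals = Counter()
    for c in counts:
        base_totals.update(c)
    score = 0.0
    for c in counts:
        col_total = sum(c.values())
        for base, n in c.items():
            # Count expected if base and phase were independent.
            expected = col_total * base_totals[base] / total
            score += (n / total) * math.log2(n / expected)
    return score
```

For a perfect repeat such as `"ACG" * 20`, the score at period 3 equals `log2(3)` bits, while at period 2 it is zero because every base is equally frequent in both phases. On finite sequences this raw score tends to grow with the period length, so comparing candidate periods fairly would require some form of statistical normalization, which this sketch omits.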
CITATION STYLE
Korotkov, E. V., & Korotkova, M. A. (2017). Search of regions with periodicity using random position weight matrices in the genome of C. elegans. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10209 LNCS, pp. 445–456). Springer Verlag. https://doi.org/10.1007/978-3-319-56154-7_40