In order to identify putative control signals of gene expression, 634 mammalian DNA sequences spanning 1.8 × 10^6 base-pairs were analysed and the frequencies of the 1024 oligonucleotides five bases long (5-tuples) were determined. We defined as rare those 5-tuples having an observed frequency less than 50% of that expected by chance on the basis of base composition, and which had a reduction in frequency not attributable to CpG suppression or to coding constraints. Very few rare 5-tuples were identified; in addition, three oligonucleotides, reverse complements of rare 5-tuples, were found to have a frequency ranging between 0.582 and 0.671. The frequency of most of the rare 5-tuples was higher in 5′ promoter regions than in exonic segments, thus mirroring the distribution pattern of known signals. Some of the rare 5-tuples identified by this strategy belonged to a portion of the nine base-pair binding site in promoters, which is also known as the octamer motif. In addition, three of the rare oligonucleotides were found to be located within other regulatory elements previously identified by techniques of molecular biology. Two rare 5-tuples were found within sites of interaction between DNA and proteins, one of them being a transcription factor. The available data about known control sequences involved in gene expression in mammals therefore provide evidence for a role in gene regulation of the rare oligonucleotides selected. © 1988.
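The rarity criterion described above (observed frequency below 50% of that expected from base composition) can be illustrated with a minimal Python sketch. This is not the authors' program: the function names `kmer_ratios` and `rare_kmers` are hypothetical, the expected count assumes independent bases drawn at the sequence's own single-base frequencies, the input is assumed to contain only A, C, G and T, and the additional filters for CpG suppression and coding constraints are not modelled.

```python
from collections import Counter
from itertools import product


def kmer_ratios(seq, k=5):
    """Return observed/expected ratios for every k-tuple over ACGT.

    Expected counts assume independent bases at the sequence's own
    single-base frequencies (an illustrative assumption).
    """
    seq = seq.upper()
    n = len(seq)
    windows = n - k + 1  # number of overlapping k-tuple positions
    base_freq = {b: seq.count(b) / n for b in "ACGT"}
    observed = Counter(seq[i:i + k] for i in range(windows))

    ratios = {}
    for tup in product("ACGT", repeat=k):
        kmer = "".join(tup)
        p = 1.0
        for b in kmer:
            p *= base_freq[b]
        expected = p * windows
        if expected > 0:  # skip tuples containing a base absent from seq
            ratios[kmer] = observed[kmer] / expected
    return ratios


def rare_kmers(seq, k=5, threshold=0.5):
    """Return k-tuples whose observed/expected ratio is below threshold."""
    ratios = kmer_ratios(seq, k)
    return sorted(kmer for kmer, r in ratios.items() if r < threshold)
```

Calling `rare_kmers(sequence)` on a concatenated sequence yields the candidate 5-tuples. On short inputs most 5-tuples have small expected counts, so the ratio is only meaningful for large datasets such as the one analysed here.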
Volinia, S., Bernardi, F., Gambari, R., & Barrai, I. (1988). Co-localization of rare oligonucleotides and regulatory elements in mammalian upstream gene regions. Journal of Molecular Biology, 203(2), 385–390. https://doi.org/10.1016/0022-2836(88)90006-X