Motivation: In an effort to identify potential programmed frameshift sites by statistical analysis, we explore the hypothesis that selective pressure would have rendered such sites underrepresented in protein-coding sequences. We developed a computer program to compare the observed frequencies of k-length nucleotide subsequences with the frequencies predicted by a zero-order Markov chain determined by the codon bias of the same set of sequences. The program was used to calculate and evaluate the distribution of 7-base oligonucleotides in the 6000+ putative protein-coding sequences of S. cerevisiae, preliminary to laboratory testing of the most highly underrepresented oligonucleotides for frameshifting efficiency.

Results: Among the most significant results is the finding that the heptanucleotides CUU-AGG-C and CUU-AGU-U, the sites of the programmed +1 translational frameshifts required for production in yeast of the actin filament-binding protein ABP140 and the telomerase subunit EST3, respectively, rank among the least represented phase I heptanucleotides in the coding sequences of S. cerevisiae. Laboratory experiments demonstrated that other underrepresented heptanucleotides identified by the program, for example GGU-CAG-A, are also prone to significant translational frameshifting, suggesting that genes containing other underrepresented heptamers may also encode transframe products.
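The comparison of observed heptamer frequencies with a codon-bias model can be illustrated with a minimal sketch. This is not the authors' program. The function names are hypothetical, and the plain observed/expected ratio stands in for whatever significance statistic the paper actually uses. The sketch also assumes that a "phase I" heptamer is two complete codons followed by the first base of the next codon (as the notation CUU-AGG-C suggests), that codons are drawn independently from the overall codon usage of the input set, and that every input sequence is in frame with a length that is a multiple of 3.

```python
from collections import Counter


def codon_frequencies(seqs):
    """Relative frequency of each in-frame codon across all sequences."""
    counts = Counter()
    for s in seqs:
        for i in range(0, len(s) - 2, 3):
            counts[s[i:i + 3]] += 1
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


def phase1_heptamer_ratios(seqs):
    """Observed/expected ratio for each heptamer that starts at a codon boundary.

    Expected counts come from a zero-order model: codons are independent draws
    from the codon usage of the same sequences (an assumed simplification).
    """
    # Keep only sequences that are whole numbers of codons (assumption above).
    seqs = [s for s in seqs if len(s) % 3 == 0]
    freq = codon_frequencies(seqs)

    # Probability that a codon starts with a given base, summed over codons.
    first_base = Counter()
    for codon, p in freq.items():
        first_base[codon[0]] += p

    # Count heptamers at every codon-aligned window.
    observed = Counter()
    windows = 0
    for s in seqs:
        for i in range(0, len(s) - 6, 3):
            observed[s[i:i + 7]] += 1
            windows += 1

    ratios = {}
    for hept, obs in observed.items():
        p = freq[hept[:3]] * freq[hept[3:6]] * first_base[hept[6]]
        ratios[hept] = obs / (p * windows)
    return ratios


if __name__ == "__main__":
    # Toy input; a real analysis would load the full set of coding sequences.
    toy = ["ATGAAATAA", "ATGCCCTAA"]
    ranked = sorted(phase1_heptamer_ratios(toy).items(), key=lambda kv: kv[1])
    for heptamer, ratio in ranked:
        print(heptamer, round(ratio, 3))
```

Sorting by ascending ratio puts the most underrepresented heptamers first. Heptamers that never occur are not listed, because only observed heptamers are scored here; a fuller analysis would enumerate all 4^7 candidates and apply a proper significance test rather than a raw ratio.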
Shah, A. A., Giddings, M. C., Parvaz, J. B., Gesteland, R. F., Atkins, J. F., & Ivanov, I. P. (2002). Computational identification of putative programmed translational frameshift sites. Bioinformatics, 18(8), 1046–1053. https://doi.org/10.1093/bioinformatics/18.8.1046