Small exon length variations, caused by the use of alternative donor or acceptor splice sites in very close proximity on the pre-mRNA, are among the most common splice variations. Among these, three-nucleotide variations at so-called NAGNAG tandem acceptor sites have recently attracted considerable attention, and it has been suggested that these variations are regulated and serve to fine-tune protein forms by the addition or removal of a single amino acid. In this paper we first show that in-frame exon length variations are generally overrepresented and that this overrepresentation can be quantitatively explained by the effect of nonsense-mediated decay. Our analysis allows us to estimate that about 50% of frame-shifted coding transcripts are targeted by nonsense-mediated decay. Second, we show that a simple physical model, which assumes that the splicing machinery binds stochastically to nearby splice sites in proportion to their affinities, correctly predicts the relative abundances of different small length variations at both exon boundaries. Finally, using the same model, we show that for NAGNAG sites the difference in affinity between the two neighboring sites accurately predicts whether splicing will occur only at the first site, only at the second site, or at both, giving rise to three-nucleotide splice variants. Our analysis thus suggests that small exon length variations result from stochastic binding of the spliceosome at neighboring splice sites, and that they occur when nearby alternative splice sites have similar affinity for the splicing machinery. © 2006 Chern et al.
Chern, T. M., Van Nimwegen, E., Kai, C., Kawai, J., Carninci, P., Hayashizaki, Y., & Zavolan, M. (2006). A simple physical model predicts small exon length variations. PLoS Genetics, 2(4), 606–613. https://doi.org/10.1371/journal.pgen.0020045
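The core idea of the abstract, that each nearby splice site is used in proportion to its affinity for the splicing machinery, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `site_usage` and `classify_nagnag`, the cutoff `min_fraction`, and the example affinity values are all hypothetical, and the abstract does not say how affinities are estimated or which threshold separates single-site from dual-site usage.

```python
def site_usage(affinities):
    """Return the expected fraction of transcripts spliced at each site.

    Assumes (as in the abstract's model) that usage is proportional to
    affinity. The affinities are hypothetical positive numbers; how they
    would be estimated from sequence is not specified here.
    """
    if any(a < 0 for a in affinities):
        raise ValueError("affinities must be non-negative")
    total = sum(affinities)
    if total <= 0:
        raise ValueError("at least one affinity must be positive")
    return [a / total for a in affinities]


def classify_nagnag(affinity_first, affinity_second, min_fraction=0.05):
    """Classify a NAGNAG tandem acceptor by predicted site usage.

    min_fraction is an illustrative cutoff (an assumption, not a value
    from the paper): a site used in less than this fraction of
    transcripts is treated as effectively unused.
    """
    p_first, p_second = site_usage([affinity_first, affinity_second])
    if p_second < min_fraction:
        return "first site only"
    if p_first < min_fraction:
        return "second site only"
    return "both sites"


if __name__ == "__main__":
    # Similar affinities: both acceptors are used, so a
    # three-nucleotide length variant is expected.
    print(site_usage([3.0, 1.0]))          # [0.75, 0.25]
    print(classify_nagnag(2.0, 1.0))       # both sites
    # Very different affinities: only the stronger site is used.
    print(classify_nagnag(100.0, 1.0))     # first site only
```

Under these assumptions, three-nucleotide variants appear only when the two affinities are of comparable size, which matches the qualitative conclusion of the abstract.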