Because the base composition of the translational stop codons (TAG, TAA, and TGA) is biased toward low G+C content, the density of these termination signals is expected to differ among random DNA sequences of different base compositions. The expected length of reading frames (DNA segments of sense codons flanked by in-phase stop codons) in random sequences is thus a function of GC content. Analysis of DNA sequences from several genome databases, stratified by GC content, reveals that the longest coding sequences (exons in vertebrates and genes in prokaryotes) are GC-rich, while the shortest are GC-poor. Exon lengthening in GC-rich vertebrate regions does not, however, result in longer vertebrate proteins, perhaps because the genes located in these regions have fewer exons. These effects on coding-sequence length constitute a new evolutionary meaning for compositional variation in DNA GC content.
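The dependence of expected reading-frame length on GC content can be illustrated with a minimal sketch. It rests on a simplified random-sequence model that is an assumption here, not necessarily the authors' own calculation: bases are drawn independently, with G and C equally frequent and A and T equally frequent. The function names `stop_codon_probability` and `expected_sense_codons` are hypothetical and chosen only for illustration.

```python
def stop_codon_probability(gc: float) -> float:
    """Probability that a random codon is TAA, TAG, or TGA.

    Assumed model: independent bases with P(G) = P(C) = gc / 2
    and P(A) = P(T) = (1 - gc) / 2.
    """
    if not 0.0 <= gc <= 1.0:
        raise ValueError("gc must lie between 0 and 1")
    a_or_t = (1.0 - gc) / 2.0   # probability of A (equal to that of T)
    g_or_c = gc / 2.0           # probability of G (equal to that of C)
    p_taa = a_or_t ** 3             # T, A, A
    p_tag = a_or_t ** 2 * g_or_c    # T, A, G
    p_tga = a_or_t ** 2 * g_or_c    # T, G, A
    return p_taa + p_tag + p_tga


def expected_sense_codons(gc: float) -> float:
    """Mean number of sense codons before the next in-phase stop codon.

    Under the assumed model, the run of sense codons is geometrically
    distributed, so its mean is (1 - p) / p, where p is the stop probability.
    """
    p = stop_codon_probability(gc)
    if p == 0.0:
        return float("inf")  # a sequence of only G and C contains no stop codons
    return (1.0 - p) / p


if __name__ == "__main__":
    # Tabulate the expected reading-frame length across GC contents.
    for gc in (0.3, 0.4, 0.5, 0.6, 0.7):
        print(f"GC = {gc:.1f}: {expected_sense_codons(gc):.1f} sense codons expected")
```

Under these assumptions the stop-codon probability simplifies to (1 - gc)^2 (1 + gc) / 8, which decreases as GC content rises, so the expected reading frame lengthens in GC-rich random sequences.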
Oliver, J. L., & Marín, A. (1996). A relationship between GC content and coding-sequence length. Journal of Molecular Evolution, 43(3), 216–223. https://doi.org/10.1007/BF02338829