Abstract
The termination of mature eukaryotic mRNAs occurs at specific polyadenylation sites located downstream from stop codons in the 3′-untranslated region (UTR). An accurate delineation of these sites is essential for the study of 3′-UTR-based gene regulation and for the design of pertinent probes for transcriptome analysis. Although typical poly(A) sites are located between 0 and 2 kb from the stop codon, EST sequence analyses have identified sites located at unexpectedly long ranges (5-10 kb) in a number of genes. Here we perform a complete mapping of EST and full-length cDNA sequences on the mouse and human genomes to observe putative poly(A) sites extending beyond annotated 3′-ends and into the intergenic regions. We introduce several quality parameters for poly(A) site prediction and train a classification tree to assign P-values to predicted sites. We observe a higher than background level of high-scoring sites up to 12-15 kb past the stop codon, both in human and mouse. This leads to an estimate of about 5000 human genes having unreported 3′-end extensions and about 3500 novel polyadenylated transcripts lying in present "intergenic" regions. These high-scoring, long-range poly(A) sites corresponding to novel transcripts and gene extensions should be incorporated into current human and mouse gene repositories. Published by Cold Spring Harbor Laboratory Press. Copyright © 2006 RNA Society.
Citation
Lopez, F., Granjeaud, S., Ara, T., Ghattas, B., & Gautheret, D. (2006). The disparate nature of “intergenic” polyadenylation sites. RNA, 12(10), 1794–1801. https://doi.org/10.1261/rna.136206