To explore the rules for mammalian splice-site selection using a statistical approach, we constructed an aberrant splicing database containing an extensive collection of mammalian genetic disease mutations (90 genes, 209 mutations). From this database, we confirmed that: (1) more than 90% of mutations either destroy or create splice-site consensus sequences; (2) the number of mutations mapped to individual residues in the splice-site regions roughly correlates with the degree of conservation of those residues in the consensus sequences; (3) about half of the observed aberrant splicing is exon skipping, while intron retention is rarely observed; (4) almost all of the major cryptic sites activated by mutations map within about 100 nt of the authentic splice sites. Furthermore, we found that: (5) mutations are observed more frequently in the 5' splice-site region than in the 3' splice-site region; (6) splice sites newly created by mutations are located upstream of the authentic splice sites. We hope that these observations will be used as rules for constructing a more effective system for predicting exon sequences. © 1994.