Multiple, non-allelic, intein-coding sequences in eukaryotic RNA polymerase genes

Abstract

Background: Inteins are self-splicing protein elements. They are translated as inserts within host proteins that excise themselves and ligate the flanking portions of the host protein (exteins) with a peptide bond. They are encoded as in-frame insertions within the genes for the host proteins. Inteins are found in all three domains of life and in viruses, but have a very sporadic distribution. Only a small number of intein coding sequences have been identified in eukaryotic nuclear genes, and all of these are from ascomycete or basidiomycete fungi.
Results: We identified seven intein coding sequences within nuclear genes coding for the second largest subunits of RNA polymerase. These sequences were found in diverse eukaryotes: one is in the second largest subunit of RNA polymerase I (RPA2) from the ascomycete fungus Phaeosphaeria nodorum, one is in the RNA polymerase III (RPC2) of the slime mould Dictyostelium discoideum and four intein coding sequences are in RNA polymerase II genes (RPB2), one each from the green alga Chlamydomonas reinhardtii, the zygomycete fungus Spiromyces aspiralis and the chytrid fungi Batrachochytrium dendrobatidis and Coelomomyces stegomyiae. The remaining intein coding sequence is in a viral relic embedded within the genome of the oomycete Phytophthora ramorum. The Chlamydomonas and Dictyostelium inteins are the first nuclear-encoded inteins found outside of the fungi. These new inteins represent a unique dataset: they are found in homologous proteins that form a paralogous group. Although these paralogues diverged early in eukaryotic evolution, their sequences can be aligned over most of their length. The inteins are inserted at multiple distinct sites, each of which corresponds to a highly conserved region of RNA polymerase. This dataset supports earlier work suggesting that inteins preferentially occur in highly conserved regions of their host proteins.
Conclusion: The identification of these new inteins increases the known host range of intein sequences in eukaryotes, and provides fresh insights into their origins and evolution. We conclude that inteins are ancient eukaryote elements once found widely among microbial eukaryotes. They persist as rarities in the genomes of a sporadic array of microorganisms, occupying highly conserved sites in diverse proteins. © 2006 Goodwin et al; licensee BioMed Central Ltd.

Figures

  • Table 1: Newly described inteins from the second largest subunit of RNA polymerases.
  • Table 2: The unclassified sequence from the Sargasso Sea is unlikely to represent a fragment of a viral genome. TBLASTN searches were conducted at NCBI using as a query the 59 residues from the Sargasso Sea sequence (Accession AACY01369547) that formed the putative C-extein. These 59 residues are encoded on the complementary strand, from base 556 to base 732. Each search was restricted to one of the six groups outlined below.
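
The coordinates quoted in the Table 2 caption can be sanity-checked with a little arithmetic. The sketch below is illustrative only and is not part of the authors' analysis: it assumes the base positions are 1-based and inclusive and that the range excludes any stop codon, and the helper names `residues_encoded` and `reverse_complement` are hypothetical.

```python
# Illustrative check of the Table 2 coordinates (assumption: positions are
# 1-based and inclusive, and the range contains whole codons only).

def residues_encoded(start: int, end: int) -> int:
    """Return the number of whole codons in the inclusive base range start..end."""
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError(f"range {start}..{end} is not a whole number of codons")
    return length // 3

# Translation table mapping each base to its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(dna: str) -> str:
    """Return the reverse complement, i.e. the complementary strand read 5'->3'."""
    return dna.translate(_COMPLEMENT)[::-1]

# Bases 556 to 732 span 177 nucleotides, i.e. 59 codons.
print(residues_encoded(556, 732))
```

Under these assumptions the range 556 to 732 spans 177 bases, which matches the 59 residues stated in the caption. Because the residues are encoded on the complementary strand, a reader working from the deposited sequence would apply something like `reverse_complement` to that span before translating it.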

Cite

Goodwin, T. J. D., Butler, M. I., & Poulter, R. T. M. (2006). Multiple, non-allelic, intein-coding sequences in eukaryotic RNA polymerase genes. BMC Biology, 4. https://doi.org/10.1186/1741-7007-4-38
