Generation of microsatellite repeat families by RTE retrotransposons in lepidopteran genomes

This article is free to access.

Abstract

Background. Developing lepidopteran microsatellite DNA markers can be problematic, as markers often exhibit multiple banding patterns and high frequencies of non-amplifying "null" alleles. Previous studies identified sequences flanking simple sequence repeat (SSR) units that are shared among many lepidopteran species and can be grouped into microsatellite-associated DNA families. These families are thought to be associated with unequal crossing-over during DNA recombination or with transposable elements (TEs).

Results. We identified full-length lepidopteran non-LTR retrotransposable elements of the RTE clade in Heliconius melpomene and Bombyx mori. These retroelements possess a single open reading frame encoding the Exonuclease/Endonuclease/Phosphatase and the Reverse Transcriptase/nLTR domains, a 5' UTR (untranslated region), and an extremely short 3' UTR that regularly consists of SSR units. Phylogenetic analysis supported previous suggestions of horizontal transfer among unrelated groups of organisms, but the diversity of lepidopteran RTE elements appears to be due to ancient divergence of ancestral elements rather than introgression by horizontal transfer. Similarity searches of lepidopteran genomic sequences in GenBank identified partial RTE elements, usually consisting of the 3' terminal region, in 29 species. Furthermore, we identified the C-terminal end of the Reverse Transcriptase/nLTR domain and the associated 3' UTR in over 190 microsatellite markers from 22 lepidopteran species, accounting for 10% of the lepidopteran microsatellites in GenBank. Occasional retrotransposition of autonomous elements, frequent retrotransposition of 3' partial elements, and DNA replication slippage during retrotransposition together offer a mechanistic explanation for the association of SSRs with RTE elements in lepidopteran genomes.

Conclusions. Non-LTR retrotransposable elements of the RTE clade therefore join a diverse group of TEs as progenitors of SSR units in various organisms. When microsatellites are isolated using standard SSR enrichment protocols and primers are designed at complementary repeated regions, amplification from multiple genomic sites can cause scoring difficulties that compromise their utility as markers. Screening against RTE elements in the isolation procedure provides one strategy for minimizing this problem.

© 2010 Tay et al.; licensee BioMed Central Ltd.
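As an illustration of what screening candidate loci against RTE elements might look like in practice, here is a minimal Python sketch. It is not the authors' procedure: the figure captions mention stand-alone blastx searches, whereas this sketch swaps in a much cruder shared k-mer test on both strands. The function names, the `k` and `threshold` defaults, and all sequences are hypothetical and chosen only for illustration.

```python
from __future__ import annotations


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T/N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all overlapping k-mers in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_fraction(candidate: str, reference: str, k: int = 8) -> float:
    """Fraction of the candidate's distinct k-mers found in the reference
    on either strand. A crude stand-in for a real similarity search."""
    cand = kmers(candidate, k)
    if not cand:
        return 0.0
    ref = kmers(reference, k) | kmers(reverse_complement(reference), k)
    return len(cand & ref) / len(cand)


def flag_rte_like(candidates: dict[str, str], reference: str,
                  k: int = 8, threshold: float = 0.3) -> list[str]:
    """Names of candidate loci whose shared k-mer fraction meets the
    (arbitrary, illustrative) threshold."""
    return [name for name, seq in candidates.items()
            if shared_kmer_fraction(seq, reference, k) >= threshold]


# Hypothetical toy data: an invented "RTE 3' terminus" and three invented loci.
reference = "ATGGCTAGCTAGGATCCGATCGTACGTTAGCAAGGCTAAGAAGAAGA"
candidates = {
    # carries a 30 bp piece of the reference between unrelated flanks
    "locus_a": "TTTT" + reference[10:40] + "CACACACA",
    # unrelated sequence with its own simple repeat
    "locus_b": "GTGTGTGTGTGTGTGTGTGTCCCCAAAATTTTGGGG",
    # carries a piece of the reference on the opposite strand
    "locus_c": reverse_complement(reference[5:45]),
}

flagged = flag_rte_like(candidates, reference)
print(flagged)
```

With these toy inputs only the two loci built from the reference are flagged. For real data, a proper similarity search against a curated set of RTE sequences would be the more appropriate tool; the k-mer test only conveys the idea of discarding candidates before primer design.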

Figures

  • Figure 1 Heliconius melpomene HmRTE-e01 non-LTR retrotransposable element identified from a BAC clone (GenBank:CU462842). Characteristics of the full-length HmRTE-e01 element identified from the Heliconius melpomene BAC clone AEHM-22C5 (GenBank:CU462842). The element is inserted in the minus strand of the BAC clone from nucleotide position 30,546 to 27,284. It has a single open reading frame that encodes a 990 amino acid protein sequence. AP marks the Exonuclease/Endonuclease/Phosphatase domain and RT indicates the RT_nLTR_like domain. The element has a 276 bp 5' untranslated region (UTR) and a short 14 bp 3' UTR that includes the (GAA)2GA simple repeat units represented by diagonal stripes. The stop codon (TAA) is located at 27,300 - 27,298 and is indicated by '*'. Immediately flanking the HmRTE-e01 element are two 20 bp target-site duplication (TSD, represented by filled dark boxes) sequences of 'AGATATACTTCGTTTAAACT'.
  • Figure 2 Phylogenetic analysis of non-LTR retrotransposable elements including novel H. melpomene and B. mori RTEs. Phylogenetic analysis of the RT conserved domain of non-LTR retrotransposable elements from different clades as reported in Figure 3 of Malik et al. [23], also including the recently described B. mori non-LTR CR1B element of the CR1 clade [24]. The tree was constructed using the Neighbour Joining (NJ) method as described in Malik et al. [23] with the CRE element RT conserved domain as outgroups. The NJ tree is a 50% consensus tree, with bootstrap values of >70 from 2,000 bootstrap replications indicated at respective nodes. The RTE clade includes previously described RTE-1, RTE-2, JAM1 and BDDF [23] as well as 25 newly identified elements from B. mori and one from H. melpomene. With the exception of BmCR1B, which was obtained from [24], all amino acid sequences from the RTE, R2, R4, L1, Jockey, CR1 and CRE clades of Malik et al. [23] were from their sequence alignment (EMBL:DS36752).
  • Figure 3 Neighbour-joining RTE clade phylogenetic tree. The NJ tree is a 50% consensus tree, with bootstrap values of >70 from 2,000 bootstrap replications indicated at respective nodes. Alignment of the complete RT conserved domain used the Kalign sequence alignment program [51,52] in EMBL-EBI. The Neurospora Group II intron (GenBank: S07649) was used as the outgroup. Fourteen representative BmRTE sequences used in Figure 2 have been included, along with newly described RTE elements as listed in the Methods, indicated with an asterisk. The RTE elements were broadly clustered as reported in Figure 5 of [22], although the higher number of BmRTEs and the lower number of Bov-B LINEs included in this analysis have altered the tree topology. Overall, the elements were grouped into four sister groups: animal/plant RTE, Rex3/RTE, BovB LINE/RTE, and Caenorhabditis/Bombyx RTE. Although basal to the Plant/Animal RTE and Rex3/RTE subgroups, the positions of JAM1 and SR2 remained uncertain [22] due to the lower (<70%) confidence values at the respective nodes. Two nodes representing horizontal transfer events proposed by Zupunski et al. [22] are indicated: (A) from plants to some fishes, and (B) from arthropods to reptiles and then to ruminant mammals. Note that the medaka fish Oryzias has both a Rex3 RTE element similar to other fishes, and a plant-like RTE element.
  • Figure 4 Sequence alignment between HzRTE-1-1, the LSCS1, selected lepidopteran microsatellite DNA loci and gDNA sequences. Alignment between (1) the partial Helicoverpa zea HzRTE-1 element ([25], minus bases 996 to 2,632), (2) the Lepidoptera Core Specific Sequence 1 (LSCS1) of van't Hof et al. [6], and selected examples of lepidopteran GenBank entries (3 - 11). HaD47 (GenBank:AY497338, [9]), HarSSR3 (GenBank:AJ504787, [8]) and HarSSR7 (GenBank:AJ627416, [7]) are H. armigera microsatellite DNA markers (3 - 5); (6) HzMS1-6 (GenBank:EF152206, [28]) and (7) BA-ATG230 (GenBank:DQ225294, [6]) are markers from H. zea and Bicyclus anynana, respectively. (8) and (9) are identical HaRTE elements (HaRTE-t01, Additional File 3) from introns in a cadherin gene of H. armigera (GenBank:AY714875 and GenBank:AY714876). (10) belongs to a 224 bp long partial RTE element in Trichoplusia ni (GenBank:U46130), and is located within the 2nd intron of the Preproattacin A gene from positions 908 to 1,131 (nucleotides 920 to 1,028 not shown). (11) is a partial 442 bp long RTE element in B. mori (GenBank:AB262389) and is located within positions 89,594 to 90,035 (nucleotides 89,606 to 89,930 not shown). TSD sequences are underlined; unique flanking sequence is shown in lower case. Bases identical to the HzRTE-1 sequence are denoted by dots, small gaps inserted for alignment purposes are indicated by dashes, and large gaps in the sequence are represented by '//'.
  • Figure 5 Protein sequence alignments of translations of selected lepidopteran microsatellite loci and various BmRTEs 3' termini. Five examples of lepidopteran microsatellite loci with significant amino acid similarity to the translated C-terminal amino acid sequences of various Bombyx mori RTE elements are shown. Stop codons adjacent to microsatellite repeat units and gaps inserted for alignment purposes are indicated by '*' and '-' respectively. Nucleotide positions are numbered according to GenBank entries of microsatellite DNA loci. Amino acid residue mismatches are indicated in blue; microsatellite DNA SSR units are underlined. Identity and E-values were obtained by stand-alone blastx search against the 25 BmRTEs. Microsatellite DNA loci are: AEP078 (DQ380851, Arhopala epimuta, 66% identity to BmRTE-d01, E-value = 7e-13), HarSSR7 (AJ627416, Helicoverpa armigera, 80% identity to BmRTE-d02, E-value = 1e-04), DTH081 (DQ380790 reverse complemented, Drupadia theda, 62% identity to BmRTE-d04, E-value = 2e-11), BFU068 (DQ393655, Busseola fusca, 72% identity to BmRTE-d08, E-value = 6e-14), and Hm02 (DQ020073, Heliconius melpomene, 72% identity to BmRTE-d13, E-value = 2e-05).
  • Table 1: Examples of RTE insertions in lepidopteran microsatellite loci lacking SSR units at the 3' UTR.
  • Figure 6 Non-allelic size variants of Helicoverpa armigera microsatellite DNA locus HaD47. HaD47 microsatellite locus [9] non-allelic size variants in two Helicoverpa armigera (AD1, AD2). Size variants of 140 bp (AD1) and 139 bp (AD2) are most similar to the HaD47 published allele (142 bp, nucleotides 131 to 272; AY497338). The partial non-LTR RTE, which includes the HaD47 reverse primer (R T), and the SSR units are indicated; the forward primer (F) is on the host genome. Unknown host genomic sequences of the non-allelic size variants, including length (in bp), are indicated by '?'.
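
The coordinates quoted in the Figure 1 caption can be cross-checked with simple arithmetic. The short Python sketch below does this under three assumptions that the caption does not state explicitly: positions are 1-based and inclusive, the 990-residue protein length excludes the stop codon, and the quoted element boundaries include both UTRs but not the target-site duplications. The variable names are illustrative only.

```python
# Values quoted in the Figure 1 caption (BAC clone coordinates).
element_start, element_end = 30_546, 27_284  # minus-strand insertion
stop_codon = (27_300, 27_298)                # TAA, minus strand
utr5_len = 276                               # bp
utr3_len = 14                                # bp
orf_aa = 990                                 # amino acids


def span_length(a: int, b: int) -> int:
    """Length of an inclusive coordinate span, regardless of strand."""
    return abs(a - b) + 1


# Element length from its BAC coordinates (assumes inclusive positions).
element_len = span_length(element_start, element_end)

# Element length from its parts (assumes 990 aa excludes the stop codon).
orf_nt = orf_aa * 3
stop_len = span_length(*stop_codon)
total = utr5_len + orf_nt + stop_len + utr3_len

# 3' UTR length from coordinates: on the minus strand it runs from the base
# just after the stop codon (27,297) down to the element end (27,284).
utr3_from_coords = span_length(stop_codon[1] - 1, element_end)

print(element_len, total, utr3_from_coords)
```

Under these assumptions the two length estimates agree at 3,263 bp (276 + 2,970 + 3 + 14), and the 3' UTR derived from the stop codon position matches the quoted 14 bp.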


Cite

Tay, W. T., Behere, G. T., Batterham, P., & Heckel, D. G. (2010). Generation of microsatellite repeat families by RTE retrotransposons in lepidopteran genomes. BMC Evolutionary Biology, 10(1). https://doi.org/10.1186/1471-2148-10-144
