BioMed Central
BMC Genomics
Open Access
Research article

Bacterial repetitive extragenic palindromic sequences are DNA targets for Insertion Sequence elements

Raquel Tobes* and Eduardo Pareja

Address: Bioinformatics Unit, Era7 Information Technologies SL, BIC Granada CEEI, Parque Tecnológico de Ciencias de la Salud - Armilla Granada 18100, Spain

Email: Raquel Tobes* - rtobes@era7.com; Eduardo Pareja - epareja@era7.com
* Corresponding author

Abstract

Background: Mobile elements are involved in genomic rearrangements and virulence acquisition and are therefore important elements in bacterial genome evolution. The insertion of some specific Insertion Sequences has been associated with repetitive extragenic palindromic (REP) elements. Considering that a sufficient number of genomes with described REPs are available, and exploiting the traceability of transposition events in genomes, we decided to exhaustively analyze the relationship between REP sequences and mobile elements.

Results: This global multigenome study highlights the importance of repetitive extragenic palindromic elements as target sequences for transposases. The study is based on the analysis of the DNA regions surrounding the 981 instances of Insertion Sequence elements with respect to the positioning of REP sequences in the 19 available annotated microbial genomes corresponding to species of bacteria with reported REP sequences. This analysis has allowed the detection of specific insertion into REP sequences for ISPsy8 in Pseudomonas syringae DC3000, ISPa11 in P. aeruginosa PA01, ISPpu9 and ISPpu10 in P. putida KT2440, and ISRm22 and ISRm19 in the Sinorhizobium meliloti 1021 genome. Preference for insertion in extragenic spaces with REP sequences has also been detected for ISPsy7 in P. syringae DC3000, ISRm5 in S. meliloti and ISNm1106 in the Neisseria meningitidis MC58 and Z2491 genomes.
The association with REP elements that we have detected by analyzing genomes is probably only the tip of the iceberg, and it could be even more frequent in natural isolates.

Conclusion: Our findings characterize REP elements as hot spots for transposition and reinforce the relationship between REP sequences and genomic plasticity mediated by mobile elements. In addition, this study defines a subset of REP-recognizer transposases with high target selectivity that can be useful in the development of new tools for genome manipulation.

Published: 24 March 2006
BMC Genomics 2006, 7:62 doi:10.1186/1471-2164-7-62
Received: 23 August 2005
Accepted: 24 March 2006
This article is available from: http://www.biomedcentral.com/1471-2164/7/62
© 2006 Tobes and Pareja; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Background

The term "REP sequences" encompasses repetitive and palindromic sequences with a length between 21 and 65 bases [1] detected in the extragenic space of some bacterial genomes. The function of REP elements is not completely determined, but REP sequences are involved in several important processes. It was proposed that REP sequences play a role as transcriptional attenuators [2], although it was later stated that REP sequences are not specific terminators [3]. Based on their role as mRNA stabilizers [4], it has also been suggested that REP elements are involved in the fine-tuning of gene expression [5]. REP sequences are binding sites for DNA polymerase I [6], for DNA gyrase [7], and for Integration Host Factor (IHF) [8], all of which play a key role in bacterial DNA physiology.

There are also some cases in which REP sequences appear as targets for transposition and recombination events. In particular, it has been shown that IS1397 and IS621 insert specifically within REP sequences of Escherichia coli and that ISKpn1 inserts into REP sequences of Klebsiella pneumoniae [9-12]. REP sequences also appear at the recombination junctions of lambda bio phages [13], and amplification of plasmid F'128 is initiated by REP-REP recombination [14].

REP elements and binding sites for global regulators share common features such as size, palindromic structure, and multiple locations in the extragenic space of the genomes. The DNA binding sites for global regulators are located at multiple sites along the genome, far from their corresponding genes. This makes it difficult to detect all the binding sites corresponding to a global regulator without a specific definition of their binding sequence on the DNA. In the case of transposases, however, each DNA binding site is located around the insertion point of the mobile element. Hence, each transposition event remains recorded in the genome, allowing the tracing of the last DNA sites bound by each transposase. Considering that a sufficient number of genomes were available for organisms with a described presence of REP, and exploiting the traceability of transposition events in genomes, we decided to analyze the relationship between REP sequences and mobile elements.
We have carried out an exhaustive study of all the insertion sites of mobile elements in the genomes with REP elements. This analysis has allowed us to detect that REP sequences are specific targets of insertion for IS elements in the genomes of Pseudomonas syringae pv. tomato DC3000, Pseudomonas aeruginosa PA01, Pseudomonas putida KT2440, and Sinorhizobium meliloti 1021, and to detect a probable association in Neisseria meningitidis MC58 and Neisseria meningitidis Z2491.

Results

In analyzing the association between REP sequences and mobile elements, we have distinguished two types of association: (i) type 1 association, in which the percentage of association is 100% and each IS copy is inserted in the same position of a REP sequence, making it possible to define the DNA target consensus sequence (Tables 1 and 2 and Figure 2), and (ii) type 2 association, in which the IS elements are near or adjacent to REP sequences, but fragments of broken REP sequences immediately flanking the IS elements are not detected (Tables 1 and 2).

We have detected a type 1 association for ISPsy8 in P. syringae DC3000, for ISPa11 in P. aeruginosa PA01, for ISPpu9 and ISPpu10 in P. putida KT2440, and for ISRm22 and ISRm19 in the S. meliloti 1021 genome. Figure 1 shows IS elements flanked by the two fragments of the broken REP sequences, and the alignments of the reconstructed REP sequences corresponding to the insertion sites are shown in Figure 2. In addition, the alignments of the complete sequences of each IS element, including their flanking regions, are in the additional material [see Additional file 1, file 2, file 3, file 4, file 5 and file 6]. Remarkably, in all cases of type 1 association, 100% of the copies of each IS are associated with REP (Tables 1 and 2), indicating high selectivity for their REP sequence target. The results of this sequence analysis allow us to affirm that REP elements are target sequences for transposases.
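The distinction between the two association types can be illustrated with a minimal sketch. It is not the procedure used in the study: the `ISCopy` record, its field names, and the "unclassified" fallback are assumptions made for illustration, and the rule applied for type 2 (no broken REP flanking any copy, at least one copy near a REP) is a simplification of the definition given above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ISCopy:
    """Hypothetical record for one IS copy; not the paper's data model."""
    name: str
    # REP position after which the IS interrupts the REP sequence,
    # or None when no broken REP fragments flank this copy.
    rep_break_position: Optional[int]
    # True when an intact REP lies near or adjacent to this copy.
    rep_nearby: bool


def classify_association(copies: List[ISCopy]) -> str:
    """Return 'type 1', 'type 2' or 'unclassified' for the copies of one IS."""
    if not copies:
        return "unclassified"
    breaks = [c.rep_break_position for c in copies]
    # Type 1: every copy interrupts a REP, always at the same position.
    if all(b is not None for b in breaks) and len(set(breaks)) == 1:
        return "type 1"
    # Type 2 (simplified): no copy is flanked by broken REP fragments,
    # but at least one copy sits near or adjacent to a REP.
    if all(b is None for b in breaks) and any(c.rep_nearby for c in copies):
        return "type 2"
    return "unclassified"


# Five copies, all interrupting the REP after position 32 (the ISPsy8 case).
ispsy8 = [ISCopy(f"ISPsy8 copy {i}", 32, True) for i in range(1, 6)]
print(classify_association(ispsy8))
```

Keeping the break position per copy, rather than a single yes/no flag, is what allows the sketch to require that all copies share the same insertion point, which is the property that makes a target consensus definable.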
There are five ISPsy8 elements in the P. syringae DC3000 genome, and in all cases they were inserted into a REP sequence. ISPsy8 always broke the REP element at exactly the same point of the sequence, generating a direct repeat of three bases (Figure 1) [see Additional file 1]. A conserved arrangement is maintained in all ISPsy8 insertion areas: a fragment of the REP sequence, a direct repeat of three bases, the left end of ISPsy8, the transposase OrfA, the transposase OrfB, the right end, the other direct repeat, and the remaining fragment of the REP sequence (Figure 1) [see Additional file 1]. In four cases, the broken REP elements are in the minus strand, and in one case, the broken REP element is located in the plus strand (Figure 1). However, in all cases, the transposase ORFs are in the plus DNA strand [see Additional file 1].

The point of insertion within the REP element is exactly between the bases occupying positions 32 and 33 of the REP sequence. All these REP elements share a consensus sequence (Figure 2) adjacent to the ISPsy8 insertion point. A direct repeat of three base pairs, corresponding to positions 33, 34 and 35 of each broken REP sequence, is generated and appears at both ends of the IS element (Figure 1) [see Additional file 1]. Palindromy can probably induce REP sequences to adopt hairpin secondary structures. Strikingly, the ISPsy8 insertion site is located exactly at the symmetry axis of one of the two probable hairpin structures predicted for the REP sequence of P. syringae [5] (Figure 2) [see Additional file 1]. The location of ISPsy8 within clusters of REP elements was decisive for the detection of REP elements broken at the ISPsy8 insertion point. In four cases, ISPsy8 was inserted into a cluster of REP sequences, and in one case, it was inserted into an isolated REP element [see Additional file 7].
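The conserved arrangement described above (REP fragment, three-base direct repeat, IS element, second direct repeat, remaining REP fragment) can be sketched as a string operation, together with the inverse step that removes the element and one copy of the repeat. This is only an illustration under stated assumptions: the function names are hypothetical, the 40-base target is an arbitrary made-up string rather than a real REP sequence, and the element is a placeholder run of `N`.

```python
def insert_with_direct_repeat(target: str, element: str,
                              cut_after: int, repeat_len: int = 3) -> str:
    """Insert `element` after 1-based position `cut_after` of `target`,
    duplicating the next `repeat_len` bases on both sides of the element."""
    if cut_after + repeat_len > len(target):
        raise ValueError("target too short for the requested direct repeat")
    repeat = target[cut_after:cut_after + repeat_len]
    left = target[:cut_after] + repeat                # fragment + repeat
    right = repeat + target[cut_after + repeat_len:]  # repeat + remainder
    return left + element + right


def reconstruct_target(left_flank: str, right_flank: str,
                       repeat_len: int = 3) -> str:
    """Join the two flanks of an element, keeping one copy of the repeat."""
    if left_flank[-repeat_len:] != right_flank[:repeat_len]:
        raise ValueError("flanks do not share the expected direct repeat")
    return left_flank + right_flank[repeat_len:]


# Arbitrary 40-base string used only as a stand-in target (not real data).
TARGET = "GCCTGATGGCGCTACGCTTATCAGGCCTACGTGATCCGGA"
ELEMENT = "N" * 10  # placeholder for the IS element

# Insertion between positions 32 and 33; positions 33-35 become the repeat.
inserted = insert_with_direct_repeat(TARGET, ELEMENT, cut_after=32)

start = inserted.index(ELEMENT)
left_flank = inserted[:start]
right_flank = inserted[start + len(ELEMENT):]

print(reconstruct_target(left_flank, right_flank) == TARGET)
```

The check inside `reconstruct_target` mirrors the observation that the same three bases appear at both ends of the element; if the flanks did not share them, the joined sequence would not be a faithful reconstruction.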
When we joined the two fragments located on either side of ISPsy8, the REP sequence appeared perfectly reconstructed (Figure 2) [see Additional file 1]. In the cases where the broken REP sequence formed part of a cluster, its reconstructed sequence was very similar to the REP