Species-specific repetitive extragenic palindromic (REP) sequences in Pseudomonas putida.
- PubMed: 11937637
Pseudomonas putida KT2440 is a soil bacterium that effectively colonises the roots of many plants and degrades a variety of toxic aromatic compounds. Its genome has recently been sequenced. We describe that a 35 bp sequence with the structure of an imperfect palindrome, originally found repeated three times downstream of the rpoH gene terminator, is detected more than 800 times in the chromosome of this strain. The structure of this DNA segment is analogous to that of the so-called enterobacteriaceae repetitive extragenic palindromic (REP) sequences, although its sequence is different. Computer-assisted analysis of the presence and distribution of this repeated sequence in the P.putida chromosome revealed that in at least 80% of the cases the sequence is extragenic, and in 82% of the cases the distance of this extragenic element to the end of one of the neighbouring genes was <100 bp. This 35 bp element can be found either as a single element, as pairs of elements, or sometimes forming clusters of up to five elements in which they alternate orientation. PCR scanning of chromosomes from different isolates of Pseudomonas sp. strains using oligonucleotides complementary to the most conserved region of this sequence shows that it is only present in isolates of the species P.putida. For this reason we suggest that the P.putida 35 bp element is a distinctive REP sequence in P.putida. This is the first time that REP sequences have been described and characterised in a group of non-enterobacteriaceae.
Isabel Aranda-Olmedo, Raquel Tobes, Maximino Manzanera, Juan L. Ramos and
Consejo Superior de Investigaciones Científicas, Estación Experimental del Zaidín, Departamento de
Bioquímica y Biología Molecular y Celular de Plantas, Apdo. de correos 419, E-18080 Granada, Spain
Received November 9, 2001; Revised and Accepted February 22, 2002
Pseudomonads are able to metabolise an enormous range of
natural and synthetic organic compounds. They play a crucial
role in the process of mineralisation and recycling of organic
matter in nature (1). Their high adaptation capacity and their
catabolic potential have instigated biochemical and genetic
studies of different species of the genus Pseudomonas over the
years (2). Among them, the soil bacterium Pseudomonas
putida has received special attention due to its potential applications
in biotechnology for the control of environmental
pollution, promotion of plant growth and control of pathogens.
Pseudomonas putida mt-2 (ATCC 33015) was first
described by Stanier and co-workers (1) and was later shown to
carry two different pathways for benzoate metabolism (4).
Pseudomonas putida KT2440 (5) is a plasmid-cured, spontaneous
rmo– derivative of P.putida mt-2, isolated to allow
genetic analysis and manipulation. This strain has been extensively
characterised physiologically, biochemically and
genetically, so that it is currently considered a representative
strain of the species P.putida. Its genome was the subject of an
early sequencing programme, through which the sequence of its
chromosome has recently been determined and is currently in the
process of annotation (www.tigr.org).
Repetitive extragenic palindromic (REP) elements were first
described in Escherichia coli as 35 bp sequences composed of
a highly conserved inverted repeat with the potential of
forming a stem–loop structure (6,7). The sequences have been
extensively characterised in E.coli and Salmonella typhimurium
where they are present in the chromosome more than 500 times
either as single independent units or as part of different types
of clusters, i.e. bacterial interspersed mosaic elements (BIMEs)
(8). Similar sequences were found in Klebsiella pneumoniae and
other enterobacteria (9). To date, REP sequences have not been
described in other prokaryotic microorganisms.
We previously characterised in detail the rpoH gene of
P.putida, which is convergent with the mtgA gene. Analysis of
the 198 bp intergenic region revealed the presence of a 35 bp
inverted repeat element located 13 bp downstream of the
rho-independent terminator of the rpoH gene. This 35 bp
sequence was repeated three times in this intergenic region. In
the present study, we have carried out BLAST scans of the
available genome sequence of the strain and we have found
that this element is repeated more than 800 times in the
chromosome, with a high degree of sequence conservation.
We have experimentally determined that this specific 35 bp
DNA sequence is restricted to strains of the species P.putida.
The possible function of this type of sequence is discussed.
*To whom correspondence should be addressed. Tel: +34 958 121011; Fax: +34 958 129600; Email: email@example.com
The authors wish it to be known that, in their opinion, the first three authors should be regarded as joint First Authors
each type of intergenic space; and (vii) distances between the
REP sequence pairs and the end of the convergent genes
limiting an intergenic space. The programs are written in
BASIC and will be made available by the authors upon
request. A version written in C is currently being developed.
Identification of a repetitive element in the P.putida
In the genome of P.putida KT2440 the rpoH gene is followed
by a 34 bp inverted repeat sequence that forms a hairpin and
has rho-independent terminator activity (25). This sequence is
within the 198 bp-long intergenic sequence that separates rpoH
from its neighbouring convergent mtgA gene. Detailed analysis
of this 198 bp intergenic region revealed that, in addition to the
hairpin structure, three copies of a well conserved 35 bp
sequence, located at 13, 57 and 102 bp from the terminator
sequence, were present: these were organised as a direct unit
followed by two copies in the opposite orientation (Fig. 1A).
The sequence within each unit was partially palindromic,
containing an internal 6 bp inverted repeat. An initial search
for homology with the intergenic rpoH/mtgA region against
DNA sequences deposited in GenBank revealed almost 30
sequences with significant hits in which the conserved
sequence corresponded to the stretch of 35 bp. All these
sequences belonged to strains of the species P.putida or to
unidentified Pseudomonas strains. The alignment of these
sequences revealed a conserved internal inverted repeat
5′-GCGGGN4CCCGC-3′. The element seemed to be wide-
spread in sequences deposited in GenBank for genes belonging
to bacteria of the genus Pseudomonas. Consequently, we
decided to analyse in detail the presence of this sequence in the
recently completed genome of P.putida KT2440 (www.tigr.org).
We selected the 35-base sequence
5′-CCGGCCTCTTCGCGGGTAAGCCCGCTCCTACAGGG-3′, located downstream
of the rpoH gene, as a query sequence for a BLAST
search against the whole P.putida KT2440 genome. The BLAST service
available at TIGR limited the number of sequences with hits in the
output. Hence, we used a locally executable BLAST without
this limitation. A preliminary analysis of the detected
sequences allowed us to establish their main features: (i) a
length of 35 nt; (ii) the presence of the central palindromic motif
GCGGGnnnnCCCGC; and (iii) a dispersed similarity along
the 35 bases.
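The features listed above can be checked directly on the query sequence. The following is a minimal Python sketch, not the BASIC programs used in this work; the helper names `reverse_complement` and `find_central_motif` are illustrative assumptions. It confirms the 35-base length, locates the central motif and verifies that the two 5-base arms of the motif are reverse complements of each other.

```python
import re

# 35-base query located downstream of rpoH (taken from the text).
QUERY = "CCGGCCTCTTCGCGGGTAAGCCCGCTCCTACAGGG"

# Central palindromic motif GCGGGnnnnCCCGC: two 5-base arms around a 4-base spacer.
MOTIF = re.compile(r"GCGGG[ACGT]{4}CCCGC")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_central_motif(seq: str):
    """Return (start, matched_text) for the first motif occurrence, or None."""
    match = MOTIF.search(seq)
    return (match.start(), match.group()) if match else None


hit = find_central_motif(QUERY)
print(len(QUERY), hit, reverse_complement("GCGGG"))
```

The matched span is 14 bases long, which corresponds to the internal dyad symmetry of the element.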
The BLAST parameter W, which defines the word size for the
initial matching, has a default value of 11. We found
that it was necessary to set this parameter to 7 to detect a
pattern with these characteristics. As only the central motif was
totally conserved, the mismatch penalty (q)
was also changed from –3 to –2, and since the central palindromic
motif did not allow gaps, the BLAST search was done without
gaps. From this BLAST search with these parameters in the
complete genome of P.putida we obtained a set of around 1300
Multiple alignment of the sequences was carried out and we
established a filter of significance (E) and length for this set.
We selected sequences longer than 19 bases with an E-value
<1. This reduced the number of sequences to 804 (Fig. 2 shows
the alignment of the best 30 sequences matching the query) and
allowed for definition of the following consensus sequence:
letters indicate the presence of this base in this position in 90%
of the aligned sequences and lowercase letters in 50% of the
sequences) with a central palindromic motif (underlined). The
central dyad symmetry was maintained in most of the
sequences through compensatory base changes. In 98% of the
cases, the change GGGt to GGGa was counterbalanced by the
change aCCC to tCCC (data not shown). We evaluated the
relevance of this palindromic structure with the rationale that,
Figure 1. (A) DNA sequence and features between the rpoH and mtgA genes.
The STOP codons of rpoH and mtgA are indicated. The rho-independent termina-
tor sequence is in italics and underlined. Arrows underline the three repetitive
sequences found in this intergenic region. The direction of the arrow indicates
the orientation of the repetitive sequence. (B) Schematic representation of the
intergenic region. The orange, blue and green arrows indicate the orientation of
the three different REP sequences found downstream of rpoH. T indicates the
rho-independent terminator. (C) Termination activity of sequences located
downstream of the rpoH gene. Pseudomonas putida KT2440 carrying the indicated
plasmids was grown until exponential phase was reached, and β-galactosidase
activity was determined as described in Materials and Methods. Data are
presented as percentage of the activity measured in cells bearing pMPR (400
Miller units). The leftmost arrow (black) depicts the λp′R promoter, the white
arrow indicates the ′lacZ reporter gene, while the orange, blue and green arrows
between the λp′R promoter and ′lacZ show the different REP sequences found
downstream of rpoH. Striped arrows show a random sequence, in either orienta-
tion, and the hairpin indicates the rho-independent terminator. Data are the
mean of three independent assays, with SD <10% of the given value.
if the secondary structure was maintained through compensatory
mutations, these mutations would appear at a higher frequency
than expected. When we analysed the 804 REP sequences, we
found 85 base changes on the left side of the central palindrome
GCGGGN4CCCGC, and 106 changes on the right side. Among
these changes, 21 were symmetrical and complementary. Given a
mutation on the left side of the palindrome, we estimated the
probability of finding a compensatory mutation on the other
side of the palindrome (pcm) as the product of the probability of
finding a mutation in the symmetrical position, 106/(804 × 5),
and the probability of the change being complementary (1/3).
Therefore, the expected number of complementary mutations
would be 85 × pcm = 0.75. However, the observed number of
complementary mutations was 21, 28-fold higher than
expected. These data reinforce the importance of conserving
the palindromic structure, and allow us to suggest that the
secondary structure is conserved through the selection of
compensatory mutations. Because of the structure, abundance,
degree of conservation and parallelism of these sequences with
the REP sequences previously described in enterobacteriaceae,
we believe we have found for the first time REP sequences in
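The compensatory-mutation estimate can be reproduced with a few lines of arithmetic. This is a sketch using only the counts quoted in the text; reading the factor 5 as the number of positions in each arm of the GCGGG N4 CCCGC palindrome is our assumption, and the variable names are illustrative.

```python
N_SEQUENCES = 804            # REP sequences analysed
ARM_LENGTH = 5               # assumed meaning of the factor 5: positions per arm
LEFT_CHANGES = 85            # base changes on the left side of the palindrome
RIGHT_CHANGES = 106          # base changes on the right side of the palindrome
OBSERVED_COMPENSATORY = 21   # symmetrical and complementary changes observed

# Probability that a given right-arm position in a given sequence is changed.
p_symmetric_change = RIGHT_CHANGES / (N_SEQUENCES * ARM_LENGTH)
# Probability that the change is the complementary base (1 of 3 alternatives).
p_complementary = 1 / 3
p_cm = p_symmetric_change * p_complementary

expected = LEFT_CHANGES * p_cm
fold = OBSERVED_COMPENSATORY / expected
print(f"pcm = {p_cm:.5f}, expected = {expected:.2f}, fold = {fold:.0f}")
```

With these inputs the expected count is about 0.75 and the observed count is about 28 times larger, in agreement with the figures given above.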
Pseudomonas putida REP sequences are favoured in
extragenic spaces between convergent genes
To determine whether the sequences were extragenic or intragenic,
the distance between two adjacent sequences and
the distance to the flanking genes, we developed computer-assisted
methods to analyse these sequences in the context of
the ORFs in the genome of P.putida KT2440. We found that
∼80% of these REP sequences were located in extragenic
regions. In 89% of cases, the intergenic region was <500 bases
in length (Fig. 3A). Taking into account that only 11% of the
genome bases of P.putida are extragenic (R.Tobes, unpublished
data), this location of REP sequences in extragenic spaces
is not likely to be random. We performed a chi-square analysis
to test the randomness of this distribution. We concluded that,
for each base, the intra-REP location and the extragenic location
were related features with a probability close to 1 (the
P-value for no relation was reported as 0.000000), thus suggesting that the
presence of REP sequences in intergenic spaces is not random.
This could reflect a selection against the appearance of these
elements within coding regions, or a positive selection for the
presence of REP sequences in extragenic spaces.
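The per-base chi-square test described above cannot be rerun from the numbers in the text alone, since the genome length is not quoted here. As a rough illustration of the same reasoning, the sketch below runs a simplified goodness-of-fit test per REP element instead of per base. It assumes that each element is an independent draw, that 650 of the 804 elements are extragenic (the Table 1 total) and that 11% of elements would be extragenic under random placement. It is not the authors' analysis.

```python
TOTAL_REP = 804
EXTRAGENIC_REP = 650          # total REP sequences in intergenic spaces (Table 1)
EXTRAGENIC_FRACTION = 0.11    # fraction of genome bases that are extragenic

observed = [EXTRAGENIC_REP, TOTAL_REP - EXTRAGENIC_REP]
expected = [
    TOTAL_REP * EXTRAGENIC_FRACTION,
    TOTAL_REP * (1 - EXTRAGENIC_FRACTION),
]

# Pearson chi-square statistic with 1 degree of freedom.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

CRITICAL_1DF_P001 = 10.828    # chi-square critical value, 1 d.f., alpha = 0.001
print(f"chi-square = {chi_square:.1f}")
print("random placement rejected:", chi_square > CRITICAL_1DF_P001)
```

Even under this simplified model the statistic exceeds the critical value by more than two orders of magnitude, which is in line with the conclusion that the extragenic location is not random.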
We also analysed the distance between REP sequences
within the same intergenic space. In all cases where more than
one REP sequence was present, they were separated by <51 bases
and always appeared in opposite orientation. When there are
several REP sequences within an intergenic region, these
sequences form a cluster. We detected 225 isolated REP
sequences, 372 REP sequences forming pairs, 36 forming clusters
of three sequences, 12 in clusters of four sequences and one
cluster of five sequences (Fig. 3B). This mode of organisation
has also been described in the E.coli chromosome where some
REP sequences are organised as complex elements called
BIMEs (28). In P.putida, when clustered as pairs, REP
sequences were always found in opposite orientation separated
by short stretches of sequence, opening the possibility of
Figure 2. Sequence alignment of the 30 best P.putida REP sequences matching
the REPa1 located downstream rpoH. Numbers are those assigned to each REP
sequence in our database. Letters in red are bases present at this position in
90% of the 804 sequences. Letters in blue are bases present at this position in
50% of the 804 sequences. The palindromic motif is underlined.
Figure 3. (A) Size distribution of intergenic spaces in P.putida KT2440.
Intergenic spaces were grouped in four intervals, according to their size. The
figure shows the frequency for each size interval. (B) REP sequence distribution in
P.putida KT2440. In P.putida KT2440 REP sequences are distributed as isolated
elements or forming clusters. We have represented the frequency of REP
sequences that appear isolated or forming clusters of two, three, four or five
secondary structure formation. This would involve the co-
evolution of the sequences in a pair, and would imply that the
similarity between two elements in a pair should be higher than
the similarity between randomly selected pairs of REP
sequences. To test this hypothesis, we performed the following
analysis. We selected all the pairs of REP sequences (a pair is
defined as two inverted REP sequences located in the same
intergenic space and <100 bp apart from each other) and ran a
BLAST (Bl2seq) between both elements of each pair. The
average E-value obtained was 0.00278. Considering the group
of 804 REP sequences, there are 322 806 different possible
pairs of REP sequences. We randomly selected 120 967
REP pairs of the 322 806 possible different pairs of the 804
REP sequences and ran a BLAST (Bl2seq) as before. The
average E-value obtained in this case was 0.01179, which is a
value 4.241 times higher than the average E-value between
REP sequence pairs. This suggests a selective pressure
favouring similarity between REP sequences in a pair.
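The pair counts and the E-value ratio quoted above follow from simple combinatorics. The sketch below recomputes them from the numbers in the text; the variable names are illustrative, and `math.comb` requires Python 3.8 or later.

```python
import math

N_REP = 804
# Unordered pairs of distinct REP sequences.
possible_pairs = math.comb(N_REP, 2)

SAMPLED_PAIRS = 120_967
MEAN_E_GENOMIC_PAIRS = 0.00278   # average Bl2seq E-value within genomic pairs
MEAN_E_RANDOM_PAIRS = 0.01179    # average Bl2seq E-value for random pairs

sampled_fraction = SAMPLED_PAIRS / possible_pairs
e_value_ratio = MEAN_E_RANDOM_PAIRS / MEAN_E_GENOMIC_PAIRS
print(possible_pairs, f"{sampled_fraction:.1%}", f"{e_value_ratio:.3f}")
```

The random sample therefore covers a little over a third of all possible pairs.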
With the aim of searching for a relationship between the
presence of intergenic REP sequences and the orientation of
the genes limiting the intergenic regions, we defined four types of
intergenic spaces: (i) between convergent genes; (ii) between
divergent genes; (iii) between genes positively oriented (coded
by the main DNA chain); and (iv) between genes negatively
oriented (coded by the complementary DNA chain) (Table 1).
If the REP sequences were randomly allocated in the different
types of intergenic spaces, their distribution would depend on
the number of intergenic spaces of each type. However, the
number of REP sequences between convergent genes is strikingly
2-fold higher than the expected value. Moreover, the presence
of REP sequences between divergent genes is less than half of
the expected value (Table 1). It is worth noting that very
similar results were obtained when only REP sequences that
appear as isolated elements were analysed. These data suggest
that the REP sequences are preferentially localised in inter-
genic regions limited by convergent genes, their presence
between divergent genes being avoided. Given that this feature
seemed to us essential in P.putida REP sequences because of
putative functional implications, we wondered if other known
REP sequences, i.e. E.coli REP sequences, would share
this characteristic. With this aim, we performed a similar
analysis of E.coli REP sequences based on data available at
and the E.coli genome annotation available at www.ncbi.nlm.nih.gov/
Strikingly, we also found that for E.coli the observed frequency
of REP sequences between convergent genes was 3-fold the
expected value (Table 2).
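The expected frequencies in Table 1 follow from allocating the REP sequences in proportion to the number of intergenic spaces of each type. The sketch below recomputes them from the NS and RF columns. The text labels for the two same-strand categories are ours, and truncating the ratio to two decimals is an inference from the printed values rather than a stated procedure.

```python
import math

# (label, number of intergenic spaces NS, observed REP count RF) from Table 1.
TABLE_1 = [
    ("same strand, positive", 1281, 209),
    ("same strand, negative", 1111, 155),
    ("convergent", 719, 236),
    ("divergent", 678, 50),
]

total_spaces = sum(ns for _, ns, _ in TABLE_1)
total_rep = sum(rf for _, _, rf in TABLE_1)


def truncate(value: float, decimals: int = 2) -> float:
    """Drop digits beyond `decimals` without rounding."""
    factor = 10 ** decimals
    return math.floor(value * factor) / factor


rows = []
for label, ns, rf in TABLE_1:
    # Random allocation: REP count proportional to the number of spaces.
    expected = total_rep * ns / total_spaces
    rows.append((label, round(expected, 2), truncate(rf / expected)))

for row in rows:
    print(row)
```

The same calculation can be applied to the NS and BF columns of Table 2 by replacing the input list.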
To search for a putative relationship between every inter-
genic REP sequence and its two flanking genes, we analysed
the distances of each REP sequence to its neighbouring gene
considering their START (5′-terminus) or STOP (3′-terminus)
codon. Since the frequency of REP sequences between conver-
gent genes was the highest, we could define 829 distances to a
STOP codon and 424 distances to a START codon. Figure 4
shows a graphic presentation of the frequency of each distance,
ordered by sizes. Clearly, in the set of distances to a STOP
codon, the values were grouped around a median value of 56 bp,
where the most frequent value (mode) was 15 bp. Of the
values, >80% were <92 bp (Fig. 4). In contrast, when we
analysed the distances to a START codon, the median was
Table 1. Statistical analysis of the organisation of the REP sequences that
appear in intergenic spaces

Intergenic space type (a)    NS (b)    RF (c)    EF (d)    RF/EF (e)
→•→                          1281      209       219.75    0.95
←•←                          1111      155       190.59    0.81
→•← convergent                719      236       123.34    1.91
←•→ divergent                 678       50       116.31    0.42
Total                        3789      650

(a) Types of intergenic spaces depending on gene orientation, which is schematised
(b) Number of intergenic spaces of each type in the genome of P.putida.
(c) REP frequencies: number of REP sequences in each type of intergenic space.
(d) Expected frequencies of REP sequences in each type of intergenic space.
(e) Ratio between REP frequencies and expected frequencies.
Table 2. Statistical analysis of the organisation of the E.coli BIME sequences
that appear in intergenic spaces

Intergenic space type (a)    NS (b)    BF (c)    EF (d)    BF/EF (e)
→•→                          1113       48       74.91     0.64
←•←                          1195       56       80.43     0.69
→•← convergent                610      126       41.05     3.06
←•→ divergent                 528        2       35.54     0.05
Total                        3447      232

(a) Types of intergenic spaces depending on gene orientation, which is schematised
(b) Number of intergenic spaces of each type in the genome of E.coli.
(c) BIME frequencies: number of BIME sequences in each type of intergenic
(d) Expected frequencies of BIME sequences in each type of intergenic space.
(e) Ratio between BIME frequencies and expected frequencies.
Figure 4. Distance of the intergenic REP sequences to the beginning and to the
end of a gene. Each intergenic space is limited by two genes. The orientation
of these genes determines a START codon or a STOP codon as the limit of
each extreme of the intergenic space. For each intergenic REP sequence we ana-
lysed its distance to the limits of the intergenic space to generate two sets of
data: (i) distances of REP sequences to STOP codons limiting its intergenic
space (829 distances); and (ii) distances of REP sequences to START
codons limiting its intergenic space (424 distances). The frequency distribu-
tion of both data sets is presented.
significantly higher (128 bp). Furthermore, two modes were
found, 34 and 71 bp, and only 40% of the distances to a
START codon were <100 bp (Fig. 4). These results suggest
that the REP elements in P.putida KT2440 are probably related
to the neighbouring gene(s), and their function would be
exerted through their position at the end of a gene.
When we specifically analysed the pairs of REP sequences
located between convergent genes, we found that their
distances to the ends of these genes seemed to be maintained
<30 bp: 84% of the distances to the end of the right gene and
68% of the distances to the end of the left gene were <30 bases.
This fact probably has functional implications.
REP sequences are not specific transcription terminators
Because of the partial palindromic structure and the location of
these sequences at the end of ORFs, we decided to test whether
they played a role in transcription termination. We cloned the
35 bp element and all combinations found downstream of
rpoH, in different orientations, between the λp′R
promoter and ′lacZ, in the plasmids detailed in Materials and
Methods, to determine whether they exhibited transcriptional
termination activity. We found that while the true rho-independent
terminator of rpoH lowered the activity to ∼5% (pMPRH), the
presence of the 35 bp sequence in either orientation downstream of
λp′R had little effect on the β-galactosidase activity, and
the two sequences in inverted orientation reduced β-galactosidase
activity by only 45% with respect to the construct without the
insert (Fig. 1C). To test whether this low termination activity
was sequence specific (i.e. exclusive of complementary REP
pairs) or due to a putative secondary structure formed between
two inverted sequences of this length, we constructed
pMPRA2 and pMPRA3, similar to pA12 and pA21, but with two
identical random sequences in direct or inverted orientation,
instead of the REP sequences. Figure 1C shows that two
random sequences in inverted orientation produce the same
effect as two REP sequences. These results suggest that the 35 bp
element by itself is probably not a terminator, and that the
ability of the REP sequence pairs to reduce expression of the
downstream gene does not depend on their sequence, but rather
on their inverted orientation. In any case, the effect produced is
very low compared with the termination activity of a true
REP sequences specifically identify P.putida
Escherichia coli REP sequences have been extensively used in
taxonomic studies to determine the diversity of bacterial popu-
lations (29). Repetitive element sequence-based PCR (rep-PCR)
enables the generation of DNA fingerprint patterns to discriminate
bacterial species and strains. The primers most frequently used
for rep-PCR-based fingerprinting analysis are REP, ERIC and
BOX sequences (30). We decided to test whether the REP
elements of P.putida KT2440 could be used to detect and
genotype P.putida specifically. We
selected several strains belonging to eight species of the genus
Pseudomonas and seven strains of P.putida available in our
laboratory collection. We isolated chromosomal DNA from
each strain and performed PCR using an oligonucleotide with
the P.putida REP consensus sequence. Figure 5 shows that
only P.putida strains gave PCR products under these condi-
tions, whereas none of the non-P.putida strains gave any
signal, even when using low stringency annealing conditions.
These results suggest that the REP sequence of P.putida
KT2440 identified here allows the identification of P.putida
strains. The band pattern obtained with all P.putida strains
revealed common bands, and several strain-specific bands.
This probably reflects genome reorganisations, and allows a
very specific genotyping of strains. A genotyping method
based on species-specific REP sequences would only require
the detection of the presence or absence of PCR products,
rather than the more complex analysis of band patterns. Such a
method would be suitable for the automation of genotyping
The analysis presented in this work shows the presence of
highly repetitive extragenic palindromic sequences in the chromo-
some of P.putida KT2440. The structure of the REP sequence
detected in P.putida is very similar to that described in E.coli
and several enterobacteriaceae such as Salmonella sp. and
Klebsiella sp., although the DNA sequence itself is different and
species specific. In fact, PCR amplification of chromosomal DNA
with an oligonucleotide complementary to the REP
element has revealed that this particular sequence can only be
detected in strains of the P.putida species. The development of
methods and conditions for the optimal use of this sequence in
the detection, identification and typing of bacterial strains will
make it a valuable tool for taxonomic and population analysis,
both in the laboratory and in natural environments. In addition,
the use of a single primer in the PCR analysis notably increases
the specificity of the reaction, which requires the close presence of
two similar sequences in opposite orientation.
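The requirement for two nearby sites in opposite orientation can be illustrated with a small in silico prediction. The sketch below is a simplification under stated assumptions: it uses exact string matching (real annealing tolerates mismatches), a hypothetical `max_len` cut-off of 3000 bp, and a toy primer and template invented for illustration. None of these come from this work.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_all(seq: str, word: str):
    """Return start indices of all (possibly overlapping) occurrences of word."""
    starts, i = [], seq.find(word)
    while i != -1:
        starts.append(i)
        i = seq.find(word, i + 1)
    return starts


def single_primer_products(template: str, primer: str, max_len: int = 3000):
    """Predict (start, length) of amplicons obtained with a single primer.

    A product needs the primer sequence on the given strand and, downstream
    of it, the reverse complement of the primer: two sites in opposite
    orientation, close enough to be amplified.
    """
    forward_sites = find_all(template, primer)
    reverse_sites = find_all(template, reverse_complement(primer))
    products = []
    for f in forward_sites:
        for r in reverse_sites:
            length = r + len(primer) - f
            if r > f and length <= max_len:
                products.append((f, length))
    return products


# Toy primer and templates, invented for illustration only.
PRIMER = "GGATCA"
INVERTED = "TTT" + PRIMER + "A" * 10 + reverse_complement(PRIMER) + "TTT"
DIRECT = "TTT" + PRIMER + "A" * 10 + PRIMER + "TTT"

print(single_primer_products(INVERTED, PRIMER))  # sites in opposite orientation
print(single_primer_products(DIRECT, PRIMER))    # sites in the same orientation
```

Only the template carrying the two sites in opposite orientation yields a predicted product; two copies in the same orientation yield none.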
REP sequences of different enterobacteriaceae described so
far share an internal structure: a 35 base sequence with a
central palindromic motif and characteristic positions defining
a head and a tail. In P.putida REP sequences, the most
conserved region is 29 bp long and contains an internal 14 bp
dyad symmetry whose presence tends to be maintained
through compensatory changes. This internal palindrome is
shorter than the imperfect dyad symmetry of E.coli REP
sequences, which is 24 bp long.
Figure 5. PCR amplification of genomic DNA from Pseudomonas strains with
the REPc primer. Equivalent amounts of total chromosomal DNA isolated from
each strain were amplified as described in Materials and Methods and the PCR
products were subjected to agarose electrophoresis.
In our search, 80% of the REP sequences were found outside
ORFs, therefore they did not interfere with gene sequences.
This proportion could be even higher since, for the sake of
simplicity in our analysis, we have considered as intragenic
those REP sequences that overlapped with gene ORFs by only
1 bp. When we analysed the presence of REP sequences close
to homologous genes in E.coli and P.putida, we obtained no
significant similarity in the position of REP sequences. This
suggests that the function of REP sequences is not related to
specific gene functions.
In E.coli, only 83 out of 500 REP sequences are present as
single units, while the rest are part of more complex mosaic
elements, called BIMEs (31). The situation in P.putida seems
to be different: only eight BIMEs were found and most of the
REP sequences were found as single elements or forming
pairs. Although lower than in E.coli, the presence of REP
sequence pairs in P.putida is very significant, and strikingly, in
these pairs the two REP elements were always found in opposite
orientation. This suggests a specific functional role for the
pairs (32). The disposition as inverted repeats opens the possi-
bility of stem–loop structure formation although they do not
seem to be involved in the transcription termination as deter-
mined experimentally in this study. Recently, a role in tran-
scription attenuation has been suggested for E.coli BIMEs
(33). Espéli et al. (33) found a similar level (∼50%) of decrease
in expression of genes located downstream of a pair of inverted
REP sequences. Unfortunately, they did not test the putative
effect of a random DNA sequence with a similar structure.
Our finding of REP sequences in the non-enterobacteriaceae
P.putida supports the hypothesis that these sequences are a
more general phenomenon in bacteria. Therefore, it is tempting
to speculate that they are involved in one or more important
bacterial functions that have not yet been identified. The
presence of these sequences in enterobacteriaceae has been
related to several functions, such as stabilisation of mRNA
(34–36), organisation of the chromosome, insertion of genetic
elements, binding site for proteins such as IHF (37), DNA
polymerase I (38) and DNA gyrase (39). Thorough studies
carried out with this last protein have shown that it is able to
bind and cleave REP sequences (40). In addition, HU stimulates
high affinity binding of gyrase, but inhibits cleavage of the
sequence (41). Yang and Ferro-Luzzi Ames (41) suggested
that REP sequences could be the target for gyrase action on
the chromosome to maintain the appropriate level of negative
supercoiling, and to anchor the chromosome supercoiled
domain loops to each other. However, this hypothesis cannot
explain why most REP sequences are located at the 3′-terminus
of the genes. Changes in DNA supercoiling mediated by DNA
gyrase could play a role in the induction of expression of some
genes and could provide a general way of increasing expres-
sion of genes in the cell in response to environmental stress
(42). Initiation of transcription for many genes is sensitive to
DNA supercoiling. On the other hand, the situation is especially
interesting in the intergenic spaces between convergent genes,
where most of the P.putida REP pairs are located. In those
cases, the distance of the REP pair to the end of the flanking
genes is usually very short (<30 bp), thus suggesting that their
function is probably related to those genes. It is known that two
neighbouring convergent genes simultaneously transcribed
generate positive supercoiling in the DNA ahead of the two
RNA polymerases moving towards each other, i.e. the intergenic
region (43,44). DNA gyrase, which has been shown to bind
REP sequences in E.coli, is known to relax positively super-
coiled domains (44). Therefore, it is tempting to suggest that
one main role of REP sequences would actually be to allow
DNA gyrase to bind and relax DNA when excessive positive
supercoiling is generated, especially between two convergent
genes. This would explain our finding that most REP pairs are
located between convergent genes in P.putida, close to their
STOP codon, and why this is also the case in E.coli, as shown
in Table 2.
Our results show that the REP sequences appear at a very
low frequency between divergent genes. The DNA region
between the beginning of two genes is a functionally compro-
mised region, because the promoters of at least two genes are
allocated in these regions, where the transcriptional machinery
needs to bind. If REP sequences were targets of cellular
proteins such as DNA gyrase, their presence in this region
would interfere with the expression of divergent genes. Also, if
REP sequences were targets for insertion of genetic elements
or for DNA rearrangement, as suggested (45,46), they would
be expected to be deleterious if found in promoter regions.
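The orientation classes discussed above (convergent, divergent and
same-strand neighbours) can be illustrated with a minimal Python
sketch. It is not the software used in this study: the Gene record,
the function names and the coordinates in the usage note are
hypothetical, and genes are assumed to be described only by their
leftmost and rightmost coordinates and their strand.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record (not taken from any annotation format)."""
    start: int   # leftmost coordinate, 1-based, inclusive
    end: int     # rightmost coordinate, inclusive
    strand: str  # "+" or "-"


def classify_intergenic(left: Gene, right: Gene) -> str:
    """Classify the space between two adjacent genes by their orientation."""
    if left.end >= right.start:
        raise ValueError("genes overlap or are not ordered left to right")
    if left.strand == "+" and right.strand == "-":
        return "convergent"  # both 3' ends face the intergenic space
    if left.strand == "-" and right.strand == "+":
        return "divergent"   # both 5' ends (promoter side) face the space
    return "tandem"          # same strand: one 3' end and one 5' end


def distance_to_nearest_gene_end(rep_start: int, rep_end: int,
                                 left: Gene, right: Gene) -> Optional[int]:
    """Distance in bp from an element to the closest flanking 3' end.

    Returns None when no 3' end faces the intergenic space
    (i.e. between divergent genes).
    """
    distances = []
    if left.strand == "+":    # the 3' end of the left gene is left.end
        distances.append(rep_start - left.end - 1)
    if right.strand == "-":   # the 3' end of the right gene is right.start
        distances.append(right.start - rep_end - 1)
    return min(distances) if distances else None
```

For example, with a "+" gene at 100-400 and a "-" gene at 500-900
(invented coordinates), classify_intergenic returns "convergent", and
a 35 bp element at 420-454 lies 19 bp from the nearest gene end.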
In summary, the finding of REP sequences in
non-enterobacteriaceae suggests a more general and crucial role
for these sequences. Their predominantly extragenic location, in
pairs of inverted repeats, at the ends of genes and between
convergent genes, gives clues to their function. Finally, the
expected broad presence of REP sequences in bacteria, together
with their species specificity, opens new ways for easy and
precise bacterial genotyping.
Supplementary Material is available at NAR Online.
We thank Ana Hurtado for the preparation of chromosomal
DNAs and T. S. Larsen and I. Cases for preliminary annotation
and suggestions on the handling of the Pseudomonas genome
sequence available at www.tigr.org. This study was supported
by a grant from the European Commission (BIO 2000-3126-CE).
1. Stanier,R.Y., Palleroni,N.J. and Doudoroff,M. (1966) The aerobic
pseudomonads: a taxonomic study. J. Gen. Microbiol., 43, 159–271.
2. Sokatch,J. (1986) The Bacteria. Pseudomonas. Academic Press, New
York, Vol. 10.
3. Ramos,J.L., Diaz,E., Dowling,D., de Lorenzo,V., Molin,S., O’Gara,F.,
Ramos,C. and Timmis,K.N. (1994) The behavior of bacteria designed for
biodegradation. Biotechnology (NY), 12, 1349–1356.
4. Nakazawa,T. and Yokota,T. (1973) Benzoate metabolism in
Pseudomonas putida (arvilla) mt 2: demonstration of two benzoate
pathways. J. Bacteriol., 115, 262–267.
5. Franklin,F.C.H., Bagdasarian,M., Bagdasarian,M.M. and Timmis,K.N.
(1981) Molecular and functional analysis of the TOL plasmid pWWO
from Pseudomonas putida and cloning of genes for the entire regulated
aromatic ring meta cleavage pathway. Proc. Natl Acad. Sci. USA, 78,
6. Higgins,C.F., Ames,G.F., Barnes,W.M., Clement,J.M. and Hofnung,M.
(1982) A novel intercistronic regulatory element of prokaryotic operons.
Nature, 298, 760–762.
7. Stern,M.J., Ames,G.F., Smith,N.H., Robinson,E.C. and Higgins,C.F.
(1984) Repetitive extragenic palindromic sequences: a major component
of the bacterial genome. Cell, 37, 1015–1026.
8. Bachellier,S., Saurin,W., Perrin,D., Hofnung,M. and Gilson,E. (1994)
Structural and functional diversity among bacterial interspersed mosaic
elements (BIMEs). Mol. Microbiol., 12, 61–70.
9. Bachellier,S., Perrin,D., Hofnung,M. and Gilson,E. (1993) Bacterial
interspersed mosaic elements (BIMEs) are present in the genome of
Klebsiella. Mol. Microbiol., 7, 537–544.
10. Moller,S., Pedersen,A.R., Poulsen,L.K., Arvin,E. and Molin,S. (1996)
Activity and three-dimensional distribution of toluene-degrading
Pseudomonas putida in a multispecies biofilm assessed by quantitative
in situ hybridization and scanning confocal laser microscopy. Appl.
Environ. Microbiol., 62, 4632–4640.
11. Huertas,M.J., Duque,E., Molina,L., Rosselló-Mora,R., Mosqueda,G.,
Godoy,P., Christensen,B., Molin,S. and Ramos,J.L. (2000) Tolerance to
sudden organic solvent shocks by soil bacteria and characterisation of
Pseudomonas putida strains isolated from toluene polluted sites.
Environ. Sci. Technol., 34, 3395–3400.
12. Ramos,J.L., Duque,E., Huertas,M.J. and Haidour,A. (1995) Isolation and
expansion of the catabolic potential of a Pseudomonas putida strain able
to grow in the presence of high concentrations of aromatic hydrocarbons.
J. Bacteriol., 177, 3911–3916.
13. Gibson,D.T., Hensley,M., Yoshioka,H. and Mabry,T.J. (1970) Formation
of (+)-cis-2,3-dihydroxy-1-methylcyclohexa-4,6-diene from toluene by
Pseudomonas putida. Biochemistry, 9, 1626–1630.
14. Esteve-Núñez,A., Lucchesi,G., Philipp,B., Schink,B. and Ramos,J.L.
(2000) Respiration of 2,4,6-trinitrotoluene by Pseudomonas sp. strain
JLR11. J. Bacteriol., 182, 1352–1355.
15. Hofte,M., Mergeay,M. and Verstraete,W. (1990) Marking the
rhizopseudomonas strain 7NSK2 with a Mu d(lac) element for ecological
studies. Appl. Environ. Microbiol., 56, 1046–1052.
16. Aarons,S., Abbas,A., Adams,C., Fenton,A. and O’Gara,F. (2000) A
regulatory RNA (PrrB RNA) modulates expression of secondary
metabolite genes in Pseudomonas fluorescens F113. J. Bacteriol., 182,
17. Collmer,A. and Bauer,D.W. (1994) Erwinia chrysanthemi and
Pseudomonas syringae: plant pathogens trafficking in extracellular
virulence proteins. Curr. Top. Microbiol. Immunol., 192, 43–78.
18. Zumft,W.G., Dohler,K., Korner,H., Lochelt,S., Viebrock,A. and
Frunzke,K. (1988) Defects in cytochrome cd1-dependent nitrite
respiration of transposon Tn5-induced mutants from Pseudomonas
stutzeri. Arch. Microbiol., 149, 492–498.
19. Ramos-González,M.I., Duque,E. and Ramos,J.L. (1991) Conjugational
transfer of recombinant DNA in cultures and in soils: host range of
Pseudomonas putida TOL plasmids. Appl. Environ. Microbiol., 57,
20. Whited,G.M. and Gibson,D.T. (1991) Toluene-4-monooxygenase, a
three-component enzyme system that catalyzes the oxidation of toluene to
p-cresol in Pseudomonas mendocina KR1. J. Bacteriol., 173, 3010–3016.
21. Hanahan,D. (1983) Studies on transformation of Escherichia coli with
plasmids. J. Mol. Biol., 166, 557–580.
22. MacNeil,T., Roberts,G.P., MacNeil,D. and Tyler,B. (1982) The products
of glnL and glnG are bifunctional regulatory proteins. Mol. Gen. Genet.,
23. Sambrook,J., Fritsch,E.F. and Maniatis,T. (1989) Molecular Cloning:
A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring
24. Abril,M.A., Michán,C., Timmis,K.N. and Ramos,J.L. (1989) Regulator
and enzyme specificities of the TOL plasmid-encoded upper pathway for
degradation of aromatic hydrocarbons and expansion of the substrate
range of the pathway. J. Bacteriol., 171, 6782–6790.
25. Manzanera,M., Aranda-Olmedo,I., Ramos,J.L. and Marqués,S. (2001)
Molecular characterization of Pseudomonas putida KT2440 rpoH gene
regulation. Microbiology, 147, 1323–1330.
26. Miller,J. (1972) Experiments in Molecular Genetics. Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, NY.
27. Corpet,F. (1988) Multiple sequence alignment with hierarchical
clustering. Nucleic Acids Res., 16, 10881–10890.
28. Gilson,E., Saurin,W., Perrin,D., Bachellier,S. and Hofnung,M. (1991)
Palindromic units are part of a new bacterial interspersed mosaic element
(BIME). Nucleic Acids Res., 19, 1375–1383.
29. Louws,F.J., Fulbright,D.W., Stephens,C.T. and de Bruijn,F.J. (1994)
Specific genomic fingerprints of phytopathogenic Xanthomonas and
Pseudomonas pathovars and strains generated with repetitive sequences
and PCR. Appl. Environ. Microbiol., 60, 2286–2295.
30. Louws,F.J., Rademaker,J.L.W. and de Bruijn,F.J. (1999) The three Ds of
PCR-based genomic analysis of phytobacteria: diversity, detection and
disease diagnosis. Ann. Rev. Phytopathol., 37, 81–125.
31. Bachellier,S., Clement,J.M. and Hofnung,M. (1999) Short palindromic
repetitive DNA elements in enterobacteria: a survey. Res. Microbiol., 150,
32. Gilson,E., Perrin,D., Clement,J.M., Szmelcman,S., Dassa,E. and
Hofnung,M. (1986) Palindromic units from E. coli as binding sites for a
chromoid-associated protein. FEBS Lett., 206, 323–328.
33. Espéli,O., Moulin,L. and Boccard,F. (2001) Transcription attenuation
associated with bacterial repetitive extragenic BIME elements.
J. Mol. Biol., 314, 375–386.
34. Newbury,S.F., Smith,N.H., Robinson,E.C., Hiles,I.D. and Higgins,C.F.
(1987) Stabilization of translationally active mRNA by prokaryotic REP
sequences. Cell, 48, 297–310.
35. Newbury,S., Smith,N.H. and Higgins,C.F. (1987) Differential mRNA
stability controls relative gene expression within a polycistronic operon.
Cell, 51, 1131–1143.
36. Stern,M.J., Prossnitz,E. and Ferro-Luzzi Ames,G. (1988) Role of the
intercistronic region in post-transcriptional control of gene expression in
the histidine transport operon of Salmonella typhimurium: involvement of
REP sequences. Mol. Microbiol., 2, 141–152.
37. Engelhorn,M., Boccard,F., Murtin,C., Prentki,P. and Geiselmann,J.
(1995) In vivo interaction of the Escherichia coli integration host factor
with its specific binding sites. Nucleic Acids Res., 23, 2959–2965.
38. Gilson,E., Perrin,D. and Hofnung,M. (1990) DNA polymerase I and a
protein complex bind specifically to E. coli palindromic unit highly
repetitive DNA: implications for bacterial chromosome organization.
Nucleic Acids Res., 18, 3941–3952.
39. Yang,Y. and Ames,G.F. (1988) DNA gyrase binds to the family of
prokaryotic repetitive extragenic palindromic sequences. Proc. Natl Acad.
Sci. USA, 85, 8850–8854.
40. Espéli,O. and Boccard,F. (1997) In vivo cleavage of Escherichia coli
BIME-2 repeats by DNA gyrase: genetic characterization of the target and
identification of the cut site. Mol. Microbiol., 26, 767–777.
41. Yang,Y. and Ferro-Luzzi Ames,G. (1990) The family of Repetitive
Extragenic Palindromic Sequences: interaction with DNA gyrase and
histonelike protein HU. In Drlica,K. and Riley,M. (eds), The Bacterial
Chromosome. American Society of Microbiology, Washington, DC,
42. Ramos,J.L., Gallegos,M.T., Marqués,S., Ramos-González,M.,
Espinosa-Urgel,M. and Segura,A. (2001) Responses of Gram-negative
bacteria to certain environmental stressors. Curr. Opin. Microbiol., 4,
43. Liu,L.F. and Wang,J.C. (1987) Supercoiling of the DNA template during
transcription. Proc. Natl Acad. Sci. USA, 84, 7024–7027.
44. Wu,H.Y., Shyy,S.H., Wang,J.C. and Liu,L.F. (1988) Transcription
generates positively and negatively supercoiled domains in the template.
Cell, 53, 433–440.
45. Bachellier,S., Clement,J.M., Hofnung,M. and Gilson,E. (1997) Bacterial
interspersed mosaic elements (BIMEs) are a major source of sequence
polymorphism in Escherichia coli intergenic regions including specific
associations with a new insertion sequence. Genetics, 145, 551–562.
46. Clément,J.M., Wilde,C., Bachellier,S., Lambert,P. and Hofnung,M.
(1999) IS1397 is active for transposition into the chromosome of
Escherichia coli K-12 and inserts specifically into palindromic units of
bacterial interspersed mosaic elements. J. Bacteriol., 181, 6929–6936.