The RecA Protein: Structure and Biological Function
Heather M. Heerssen(1),
Aaron Downs(3), and David Marcey(2)
© David Marcey, 2001
I. Introduction: The Biological Function of RecA
II. Structural Overview
III. Filament Structure
IV. DNA Binding Regions of RecA
V. ATP Binding Sites
VI. Interfilament Interactions
VII. Coprotease Activity
I. Introduction: The Biological Function of RecA
One method by which genetic variation is generated is homologous (or general)
recombination. In this process, two double stranded DNA molecules exchange segments
of DNA at sites of sequence similarity through breakage and rejoining of strands,
leading to recombinant chromosomes with new combinations of alleles at various
genetic loci. Homologous recombination occurs in diverse organisms in different
contexts, including meiotic crossing over in eukaryotes and sexduction in prokaryotes.
According to a model for
homologous recombination, a nick in one of the DNA double helices (catalyzed
by the RecBCD oligomeric protein in E. coli) allows a region of single-stranded
DNA (ssDNA) to invade a double-stranded DNA (dsDNA) helix. The intruding strand
displaces one of the paired strands and binds to the other via Watson-Crick
base pairing. The displaced strand, in turn, hybridizes with the remaining ssDNA.
The region of DNA involved in the heteroduplexes enlarges through a process
called branch migration. The resulting crossed-strand structures, known as
"Holliday junctions," are resolved by resolvase
(RuvC in E. coli) and ligase, leading to recombinant chromosomes in 50% of the
The RecA protein is a critical enzyme in this process, as it catalyzes
the pairing of ssDNA with complementary regions of dsDNA. The RecA monomers
first polymerize to form a helical filament around ssDNA. During this process,
RecA extends the ssDNA by 1.6 angstroms per axial base pair. Duplex DNA is then
bound to the polymer. Bound dsDNA is partially unwound to facilitate base pairing
between ssDNA and duplexed DNA. Once ssDNA has hybridized to a region of dsDNA,
the duplexed DNA is further unwound to allow for branch migration. RecA has
a binding site for ATP, the hydrolysis of which is required for release of the
DNA strands from RecA filaments. ATP binding is also required for RecA-driven
branch migration, but non-hydrolyzable analogs of ATP can be substituted for
ATP in this process, suggesting that nucleotide binding alone can provide conformational
changes in RecA filaments that promote branch migration.
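The extension figure quoted above can be put in context with a little arithmetic. The sketch below assumes the textbook B-form rise of 3.4 Å per base pair (a value not stated in this exhibit) and adds the 1.6 Å extension to it; the function name and the 100-base example are illustrative only.

```python
B_FORM_RISE = 3.4      # Å per base pair in B-form DNA (ASSUMED textbook value)
RECA_EXTENSION = 1.6   # Å added per base, as quoted in the text

def extended_length(n_bases, rise=B_FORM_RISE, extension=RECA_EXTENSION):
    """Return (relaxed, extended) axial lengths in Å for n_bases of DNA."""
    relaxed = n_bases * rise
    extended = n_bases * (rise + extension)
    return relaxed, extended

# Compare the axial length of 100 bases before and after RecA binding.
relaxed, extended = extended_length(100)
stretch = extended / relaxed
print(f"relaxed: {relaxed:.0f} Å, extended: {extended:.0f} Å, ratio: {stretch:.2f}")
```

Under these assumptions the bound DNA is roughly one and a half times as long as relaxed B-form DNA, which gives a sense of how much the filament stretches its substrate.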
In addition to its role in homologous recombination, RecA functions as a coprotease
for the LexA protein. In a healthy cell, LexA represses the expression of genes
encoding DNA repair proteins (SOS genes). Upon injury of DNA, LexA catalyzes
its own cleavage, thereby allowing synthesis of the necessary SOS proteins. However,
LexA undergoes this self-cleavage only when activated by an ssDNA-RecA filament.
A single filament will bind and activate several LexA proteins, each of which
then cleaves other bound proteins. Thus, ssDNA-RecA, a product of DNA injury,
stimulates DNA repair.
II. RecA: A Structural Overview
The RecA monomer consists of three domains: a large central domain
surrounded by relatively small amino
and carboxy domains. The central domain, involved in DNA
and ATP binding, consists primarily of a twisted β-sheet with 8 β-strands,
bounded by 8 α-helices. The amino domain contains a large α-helix
and a short β-strand, a
structure important in formation of the RecA polymer. Three α-helices
and a three-stranded β-sheet
are found in the carboxy domain, which facilitates interfilament associations.
III. RecA Filament Structure
Catalysis of homologous
recombination by RecA begins with the formation of a filament composed of RecA
monomers around ssDNA. The RecA filament
wraps around the DNA helically, with 6 monomers per revolution. The RecA
helix is approximately 120 Å wide, with a central diameter of 25 Å.
The carboxy termini of each monomer, which are believed
to be important in interfilament interactions, project outward from the RecA helix.
ATP is bound near the center of the helix. The amino domain of each RecA
monomer is involved in maintaining the RecA polymer bonds. As described in
the structural overview section, this region of the monomer contains a protruding
β unit.
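The filament geometry described above can be sketched numerically. The code below is a rough idealization, not a structural model: the monomers-per-turn count and the two diameters come from the text, while the helical pitch (`PITCH`) and the placement of each monomer center midway between the inner and outer radius are assumptions made purely for illustration.

```python
import math

MONOMERS_PER_TURN = 6     # from the text
OUTER_DIAMETER = 120.0    # Å, from the text
INNER_DIAMETER = 25.0     # Å, from the text
PITCH = 83.0              # Å per turn; ASSUMED placeholder, not given in the text

def monomer_positions(n_monomers, pitch=PITCH):
    """Return (x, y, z) centers for monomers placed on an idealized helix."""
    # Assumption: each monomer center sits midway between inner and outer radius.
    radius = (OUTER_DIAMETER / 2 + INNER_DIAMETER / 2) / 2
    twist = 2 * math.pi / MONOMERS_PER_TURN   # 60 degrees of rotation per monomer
    rise = pitch / MONOMERS_PER_TURN          # axial rise per monomer
    return [
        (radius * math.cos(i * twist), radius * math.sin(i * twist), i * rise)
        for i in range(n_monomers)
    ]

# Seven monomers: the seventh sits directly above the first, one pitch higher.
for i, (x, y, z) in enumerate(monomer_positions(7)):
    print(f"monomer {i}: x={x:7.2f} y={y:7.2f} z={z:6.2f}")
```

With six monomers per revolution, each subunit is rotated 60° from its neighbor, so every seventh monomer lines up with the first along the filament axis.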
The polymerization of RecA
monomers into filaments involves extensive association of the amino domain of
one monomer and the central domain of the next monomer in the filament (with a
loss of 2,890 Å² of solvent-accessible surface area per monomer).
This association can be visualized in a RecA
dimer. Part of the subunit interface involves the packing of the
amino α-helix of one monomer
between a complementary α-helix
and β-sheet
in the central domain of a neighboring monomer.
Thus, the RecA filament has an amino domain-to-central domain polarity. The monomers
are held together by a combination of hydrophobic and electrostatic interactions.
Experimental evidence supports the crystallographic data. Filament formation,
for example, is severely inhibited among RecA monomers from which the amino terminus
has been enzymatically removed. Similarly, proteins consisting only of the amino
portion of RecA prevent polymerization via competitive inhibition of the central
domain binding region. Mutation analyses have been used to identify residues at
the subunit interface critical for RecA polymerization. Monomers in which lysine216,
phenylalanine217, or arginine222 is replaced by another amino acid are unable to polymerize.
IV. DNA Binding Regions of RecA
The RecA monomer contains two DNA
binding sites in the large central domain, one for binding ssDNA, and the other
for binding duplex DNA. Both DNA binding regions include disordered loops (L1
& L2), containing residues with low electron density in the crystal. These loops,
not shown in the structure, lie close to the filament axis, and therefore are
juxtaposed with DNA. In the views that follow, the loops would project towards
the viewer, i.e. towards the DNA in a RecA-DNA filament (see RecA
Filament Structure, above).
The putative ssDNA binding region includes alpha helix G
as well as L2 (not shown), between glu194
and thr210. The putative binding site for duplex DNA is found in another disordered
region, L1 (not shown), located between glu156
and gly165.
Phylogenetic analyses have supported the conclusion that the regions containing
L1 and L2 represent DNA binding regions. Because DNA binding is an essential function
of RecA, the regions of the protein involved in this process should be highly
conserved among bacterial species. Indeed, 10 of the 23 amino acids that compose
the disordered loops are invariant in 16 different RecA proteins. Alpha
helix G, located on the carboxy side of L2, is the most highly conserved
region in the RecA monomer. At the boundary between alpha
helix G and L2 are two invariant glycine residues,
which, due to their small size, could allow maximal interaction between the negatively-charged
sugar-phosphate backbone of the DNA molecule and the positively-charged amine
groups of the helix.
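The kind of conservation count quoted above (invariant positions across aligned sequences) can be illustrated with a short sketch. The alignment below is a hypothetical toy example, not real RecA loop sequences, and the function name is illustrative; the point is only to show how "invariant in all sequences" would be tallied column by column.

```python
def invariant_positions(aligned_seqs):
    """Return 0-based column indices where every aligned sequence has the same residue."""
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    return [
        i for i in range(length)
        # A column counts as invariant if it holds one residue type and is not a gap.
        if len({s[i] for s in aligned_seqs}) == 1 and aligned_seqs[0][i] != "-"
    ]

# Hypothetical toy alignment, NOT real RecA loop sequences.
toy_alignment = [
    "GAGKTTL",
    "GSGKSTL",
    "GAGKTSL",
]
conserved = invariant_positions(toy_alignment)
print(conserved, f"{len(conserved)} of {len(toy_alignment[0])} positions invariant")
```

Applied to a real alignment of the L1 and L2 regions from many species, the same tally would yield the kind of fraction reported in the text.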
Experimental studies have also confirmed the importance of the disordered regions
in DNA binding. Mutations in several residues in and near the disordered loops
lead to inhibition of DNA binding. In L2, these residues include glycine204,
glutamate207, and glycine211.
In L1, glycine160, glycine157, and arginine169
appear particularly important in binding duplex DNA. Furthermore, photocross-linking
studies have mapped DNA binding to L1 and L2. Finally, a 20-amino acid peptide
containing the L2 sequence is capable of independently binding DNA.
A study by Kumar and colleagues showed that binding of DNA to the RecA protein
causes the disordered loops to assume alpha helical secondary structures. Interestingly,
the amount of alpha helix induced by DNA binding is correlated with its base pair
sequence. Less alpha helical structure is found in RecA proteins bound to CG-rich
oligomers than to DNA fragments abundant in AT sequences. Furthermore, binding
of homologous duplex DNA to ssDNA-RecA generates more alpha helix in the disordered
loops than does binding of heterologous DNA. Thus, induction of alpha helix in
the disordered loops may be a mechanism by which RecA pairs homologous strands.
V. The RecA ATP Binding Site
ATP binds to RecA in the central domain at a phosphate-binding
loop (P-loop), a characteristic ATP binding region found in many proteins.
Two amino acids in the P-loop, lysine72
and threonine73, are known to interact
directly with the phosphate groups of the ATP. Like the DNA binding regions, the P-loop is located on the inner surface
of the RecA filament. Bound ADP can be seen in this model
of RecA. The α-carbons of lysine72
and threonine73 can be seen adjacent to
the phosphates of the nucleotide.
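P-loops are commonly recognized in protein sequences by the Walker A consensus, G-x(4)-G-K-[S/T]. The sketch below searches a sequence for that pattern with a regular expression. The consensus is a standard motif definition rather than something stated in this exhibit, and the input is a hypothetical toy fragment for illustration, not the full RecA sequence.

```python
import re

# Walker A (P-loop) consensus: G-x(4)-G-K-[S/T]. Standard motif, not quoted in the text.
WALKER_A = re.compile(r"G.{4}GK[ST]")

def find_p_loops(protein_seq):
    """Return (start, motif) pairs, with 1-based start positions, for each match."""
    return [(m.start() + 1, m.group()) for m in WALKER_A.finditer(protein_seq)]

# Hypothetical toy fragment for illustration; not the full RecA sequence.
toy_fragment = "MAIVEIYGPESSGKTTLALQ"
for start, motif in find_p_loops(toy_fragment):
    lys_pos = start + 6   # the conserved lysine is the 7th residue of the motif
    print(f"P-loop {motif} at {start}; conserved Lys at {lys_pos}, Ser/Thr at {lys_pos + 1}")
```

By the same arithmetic, a motif beginning at residue 66 would place the lysine and threonine at positions 72 and 73, consistent with the residues named above.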
VI. RecA Interfilament Interactions
The carboxy terminus
of each RecA monomer functions in interfilament associations. In forming these
interfilament bonds, the carboxy
terminus of one monomer interacts with an area near the amino
terminus of the neighboring filament. Obviously, these interfilament interactions are critical during the
crystallization of the RecA protein for X-ray diffraction studies. However, associations
between filaments may also be important biologically. In the Tif-1 mutation of
RecA, glutamate38 is changed to lysine and isoleucine298
is converted to valine. Altering the RecA protein in this way prevents
interfilament associations, and increases the efficiency of DNA binding. This
observation suggests that RecA filament bundles form to prevent protein polymerization
around incorrect targets (i.e. dsDNA, RNA), which would induce the SOS response
VII. RecA Coprotease Activity
The LexA repressor is believed to bind to the RecA monomer
in a region on the carboxy side of the central domain.
Mutations in this region affect the ability of RecA to stimulate LexA
autoproteolysis, but not its ability to catalyze homologous recombination.
References
Story, R. M., I. T. Weber,
and T. A. Steitz. 1992. The structure of the E. coli RecA protein monomer
and polymer. Nature 355: 318-325.
Alberts, B., D. Bray, J. Lewis, M. Raff, K. Roberts, and J. D. Watson. 1994. Molecular
Biology of the Cell, 3rd edition. Garland Publishing, Inc: New York.
Konola, J. T., K. M. Logan, and K. L. Knight. 1994. Functional characterization
of residues in the P-loop motif of the RecA protein ATP binding site. Journal
of Molecular Biology 237: 20-34.
Kumar, K. A., S. Mahalakshmi, and K. Muniyappa. 1993. DNA-induced conformational
changes in RecA protein. Journal of Biological Chemistry 268: 26162-26170.
Malkov, V. A. and R. D. Camerini-Otero. 1995. Photocross-links between single-stranded
DNA and Escherichia coli RecA protein map to loops L1 (amino acid residues
157-164) and L2 (amino acid residues 195-209). Journal of Biological Chemistry
Mikawa, T., R. Masui, T. Ogawa, H. Ogawa, and S. Kuramitsu. 1995. N-terminal 33
amino acid residue of Escherichia coli RecA protein contributes to its
self-assembly. Journal of Molecular Biology 250: 471-483.
Skiba, M. C. and K. L. Knight. 1994. Functionally important residues at a subunit
interface in the RecA protein from Escherichia coli. Journal of Biological
Chemistry 269: 3823-3828.
Stryer, L. 1995. Biochemistry, 4th ed. W. H. Freeman and Company: New York.
Voloshin, O. N., L. Wang, and R. D. Camerini-Otero. 1996. Homologous DNA pairing
promoted by a 20-amino acid peptide derived from RecA. Science 272: 868-872.
1, Kenyon College, Gambier, Ohio. A first draft of this exhibit was created for D.
Marcey's Molecular Biology class, Biology 63.
2, Kenyon College, Gambier, Ohio. Present address: California Lutheran University.
Address correspondence to this author (see below).
3, Kenyon College, Gambier, Ohio. This author transferred RasMol script files
into the body of the exhibit text.