DNA Replication
Logic of DNA replication
Double-stranded DNA is replicated by the rule of
complementarity: that is, the template base determines the nature of the base
inserted opposite it in the nascent strand (new or daughter strand).
Replication must begin at a defined point (origin of replication; initiation step). It must then proceed
base by base, accurately copying (in the complementary sense) the template
information (elongation step). When
the genome or a certain portion of the genome (a replicon) has been duplicated,
elongation must end (termination step). The copied information must be
proofread to ensure the integrity of the duplicated genomes transmitted to
daughter cells.
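The complementarity rule can be illustrated with a minimal Python sketch. The function name `nascent_strand` and the example sequence are hypothetical, chosen only for illustration; the sketch assumes the standard A-T and G-C pairs and that both strands are written 5' to 3'.

```python
# Watson-Crick pairing rule: each template base dictates its partner.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}

def nascent_strand(template_5to3):
    """Return the nascent strand (written 5' to 3') for a template written 5' to 3'."""
    # Pair every base, reading the template backwards because the
    # two strands of a duplex are antiparallel.
    return "".join(PAIRING[base] for base in reversed(template_5to3.upper()))

print(nascent_strand("ATGGCA"))
```

Copying the nascent strand once more returns the original template sequence, which is why each strand carries the full information of the duplex.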
The problems faced by the replication machinery
A. First, there is a topological
problem. The genome is long, and the structure of the vast majority of genomes
is a plectonemically coiled double helix. Since replication involves strand
separation, the strands must rotate around each other in a swiveling motion in
order for the replication machine to move along while copying the bases. The
size of the E. coli genome is 4 million base pairs (4 x 10^6 bp) or, at about
10 bp per helical turn, 400,000 (4 x 10^5) turns of DNA. The replication time
for E. coli is 40 min. So the rate at which the DNA must unwind is
4 x 10^5 / 40 = 10,000 (10^4) turns per minute. Imagine that in a growing
E. coli cell, the DNA is turning at the amazing speed of 160-170 revolutions
every second. THAT IS A DIZZYING SPEED! The rate of replication is 100,000 bp
per minute, or roughly 1,700 bp per second, remarkable for any machine.
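The arithmetic in A can be checked with a short Python sketch. The genome size and replication time are the figures quoted above; the value of 10 bp per helical turn is an assumption implied by those figures, and the variable names are illustrative.

```python
GENOME_BP = 4_000_000    # E. coli genome size quoted in the text
BP_PER_TURN = 10         # assumption: about 10 bp per helical turn
REPLICATION_MIN = 40     # replication time quoted in the text

turns = GENOME_BP / BP_PER_TURN              # helical turns in the genome
turns_per_min = turns / REPLICATION_MIN      # unwinding rate per minute
turns_per_sec = turns_per_min / 60           # unwinding rate per second
bp_per_min = GENOME_BP / REPLICATION_MIN     # copying rate per minute
bp_per_sec = bp_per_min / 60                 # copying rate per second

print(f"turns in genome:  {turns:,.0f}")
print(f"turns per minute: {turns_per_min:,.0f}")
print(f"turns per second: {turns_per_sec:,.0f}")
print(f"bp per minute:    {bp_per_min:,.0f}")
print(f"bp per second:    {bp_per_sec:,.0f}")
```

These are whole-chromosome averages; the sketch does not divide the work between replication forks.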
B. Second, there is a problem due to
damage present in DNA. For example, if one strand contains a nick,
replication will convert it into a double-strand break. This will derail the
replication machine, the so-called collapse of the replication fork. There must
be some way to salvage the fork and continue replication. One way is to copy
the information from an intact duplex (say, the second copy of the nascent
duplex), and then return to the original template beyond the stalling point.
Roadblocks to replication can also
be present in the form of damaged bases (or abasic
sites). They are dealt with by a template switching mechanism or by lesion
polymerases that can insert bases opposite these damaged positions. Resulting
mismatches are corrected by repair systems.
C. Third, there is the polarity problem.
Recall that the two strands of the duplex have opposite polarity (5' to 3' and
3' to 5'). As you advance along one strand, you must move in the opposite
direction along the other at the same speed. How does the replication fork deal with this
directionality problem? You will get the answer in class in the Trombone model of replication.
D. In order to replicate very long DNA
molecules in a very short time, the replication machine has to be highly
processive. Once the replication machine is on its way, it should be able to go
a long way before it falls off the template, and a new machine has to be
assembled to continue the job. This task is accomplished with the help of a
processivity clamp (the beta clamp in E. coli; PCNA in eukaryotic cells) that
tethers the DNA polymerase holoenzyme complex to the template.
E. DNA polymerases in general cannot
start the synthesis of a DNA chain from the very beginning, that is, they
cannot initiate a chain. They can only elongate a chain. RNA polymerases, by
contrast, can start copying a template by inserting the first base and
continuing onward with the second, third, fourth base and so on. DNA synthesis
is therefore initiated with the help of an RNA polymerase (primase), which makes a short
RNA chain called an RNA primer. The DNA polymerase can then elongate
the primer.
F. The RNA primers must be removed from
the final DNA product and replaced by DNA. In E. coli, this is done by the DNA
polymerase Pol I. Details will be given in class.
G. Because of the polarity problem
discussed under C, there is one RNA
primer on one nascent strand (the leading strand), and multiple RNA primers on
the other one (the lagging strand). The short primer-initiated segments of the
lagging strand are called Okazaki fragments. The final DNA products must be rid
of the RNA primers, which are replaced by equivalent DNA.
H. For a covalently closed circular DNA
duplex (which is what the E. coli genome is), unwinding the strands will create
a negatively supercoiled (unwound) DNA domain behind the replication fork and a
compensatory positively supercoiled domain (overwound) in front of it. Remember
that if the strands are closed within a DNA ring, there is no way to dissipate
the torsional stress generated by unwinding, except to incorporate an
equivalent stress in the opposite direction (overwinding). If the overwinding
continues to build up ahead of the fork, there will come a point when the
tension is so high that the replication machine will be slowed down and finally
come to a grinding halt. Hence the tension ahead of the replication fork is
relieved by cutting the strands and joining them with the help of
topoisomerases.
I. Because the DNA is plectonemically
coiled and since the E. coli genome is circular, the product circles formed by
replication will not be free circles. They are rather linked circles or
catenanes. This topological relationship between the parent duplex and the
daughter duplexes will be explained in class. The purpose of the replication
event is to create two identical daughter molecules that can be distributed
into the two daughter cells. If the duplicated DNA molecules remain linked,
they are no good for segregation. They must be unlinked with the help of a type
II topoisomerase (topo IV in E. coli) before they can be passed on to the progeny
cells.
DNA replication is semi-conservative
In general, the replication of DNA
proceeds by a semi-conservative mechanism. Each daughter duplex produced
from one replication event will contain one parental strand and its
complementary nascent strand. If replication were conservative, one of the daughter molecules would contain both of the parental
strands and the second one would contain both of the nascent strands. Evidence
for the semi-conservative mechanism will be presented in class.
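The bookkeeping behind the semi-conservative mechanism can be followed with a small Python sketch. It is a toy model under stated assumptions: each duplex is represented as a pair of strand labels, and the labels `"parental"` and `"new"` and the function name `replicate` are hypothetical.

```python
def replicate(duplexes):
    """One semi-conservative round: every duplex yields two daughters,
    each keeping one old strand paired with a freshly made strand."""
    daughters = []
    for strand_a, strand_b in duplexes:
        daughters.append((strand_a, "new"))
        daughters.append((strand_b, "new"))
    return daughters

population = [("parental", "parental")]
for generation in range(1, 4):
    population = replicate(population)
    # A hybrid duplex still carries one of the original parental strands.
    hybrids = sum(1 for duplex in population if "parental" in duplex)
    print(f"generation {generation}: {len(population)} duplexes, {hybrids} with a parental strand")
```

In this model the two original parental strands are never paired with each other again after the first round, and exactly two duplexes in every later generation still carry one of them.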
Initiation of replication
The initiator protein DnaA binds to its recognition sequences
(9 bp boxes repeated four times) within the 250 bp origin region (OriC) of the E.
coli chromosome. The HU protein assists the binding by bending DNA. DnaA
utilizes ATP, hydrolyzes it, and helps denature AT-rich sequences. This
denaturation plus the inherent negative supercoiling of DNA helps open the
origin for initiation of DNA synthesis. The DnaC protein recruits the DnaB
helicase to the origin. The helicase is an ATP-burning locomotive that moves
along DNA and unwinds it. The movement of the helicase is in the 5' to 3'
direction. The opened-up strands are bound by single-strand binding protein (SSB),
which keeps them from reannealing.
At this origin bubble, the primase protein (DnaG) lays down
RNA primers, one on the top strand and one on the bottom. Because of the
polarity of DNA strands, the RNA primers have their 3'-OH ends pointed in
opposite directions. These primers are extended by the DNA polymerase
holoenzyme in opposite directions. This is the basis of bidirectional
replication with two replication forks, one moving leftward and the other
rightward.
Elongation of DNA chains
Elongation, which accounts for the bulk of copying the DNA bases
in a genome, is carried out by a multi-protein complex (10 subunits), the Pol
III holoenzyme.
DNA polymerase: The polymerizing activity is
contained in the alpha (α) subunit, and the proofreading activity in the
epsilon (ε) subunit. The tau (τ) subunit promotes stable template
binding and dimerization of the two polymerase complexes. The gamma (γ),
delta (δ) and delta prime (δ') subunits form part of the β clamp
loading complex. There are also three other subunits, theta (θ), chi (χ)
and psi (ψ), that form part of the holoenzyme.
Each replication fork contains a dimer of the Pol III
complex. Note that two DNA strands have to be replicated at each fork.
The chemistry of the nucleotide elongation step is a simple
nucleophilic attack by the 3'-hydroxyl of the primer on the alpha-phosphate of
the incoming dNTP (deoxynucleoside triphosphate). The result is the formation
of a 3'-5' phosphodiester bond and the release of a pyrophosphate unit. This
same basic chemistry is repeated at every nucleotide addition step.
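The repetitive nature of the elongation step can be sketched in Python. This is a toy model, not a description of the enzyme: the function name `extend_primer` is hypothetical, the primer is written in DNA letters for simplicity (a real RNA primer would contain U), and the template is written 3' to 5' so that each template position lines up with the chain position it pairs with.

```python
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}

def extend_primer(template_3to5, primer_5to3):
    """Extend a primer along a template, one nucleotide per step.

    Position i of the template (written 3' to 5') pairs with position i
    of the growing chain (written 5' to 3'). Returns the finished chain
    and the number of pyrophosphate units released.
    """
    if not primer_5to3:
        # A DNA polymerase can only elongate an existing 3'-OH end.
        raise ValueError("no primer: nothing to extend")
    chain = list(primer_5to3)
    pyrophosphates = 0
    for template_base in template_3to5[len(chain):]:
        # The 3'-OH of the chain attacks the alpha-phosphate of the dNTP
        # that pairs with the template base.
        chain.append(PAIRING[template_base])
        pyrophosphates += 1  # one pyrophosphate leaves per bond formed
    return "".join(chain), pyrophosphates

print(extend_primer("TACGGT", "AT"))
```

Calling the function with an empty primer raises an error, mirroring the rule that DNA polymerases elongate chains but cannot initiate them.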
The lagging nascent DNA strand is synthesized in a discontinuous
fashion. An RNA primer is extended by the polymerase complex, the core subunits
dissociate, and they then reassociate with a new clamp positioned at the next RNA
primer by the clamp loading complex. The primer is made by the primosome,
which consists of the DnaB helicase and the DnaG primase. These discontinuous
RNA-DNA fragments are called Okazaki fragments.
Termination
There are
sequences located roughly 180 degrees from OriC that signal the
advancing forks to stop and bring replication to an end. These multiple 20 bp sequences
are arranged in a directional manner. One set is functional in the context of
the leftward fork, the other in the context of the rightward fork. The
termination complex consists of the terminator protein Tus bound to the ter sequence.
The Tus protein interacts with the DnaB helicase and prevents further
advancement of the replication fork. The mechanism by which the last segment of
DNA (that between the stopped left and right replication forks) is replicated
is not understood.
Additional points
1. The E. coli gyrase protein (a type II
topoisomerase) is part of the advancing replication fork. It is responsible for
relieving the torsional stress ahead of the replication machine due to
accumulation of positive supercoils.
2. The active site of Pol III must
accommodate all four dNTPs (A, G, C and T), hence the incorporation of the
correct base is determined at the level of complementary hydrogen bonding
between the template base and the incoming base. When a wrong base is brought
in, it does not pair correctly, and it is quickly ejected based on orientation
effects.
3. If misincorporation does occur, the
mismatch stalls the fork and blocks the incorporation of the next base.
This activates proofreading by the epsilon subunit, which has a 3' to 5'
exonuclease activity. The recessed chain is then elongated again by the 5' to
3' polymerase activity of the Pol III alpha subunit.
4. The RNA primers are removed by the
repair polymerase Pol I. It has a 5' to 3' exonuclease activity and also a 5'
to 3' DNA polymerase activity. As it chews away the RNA base by base, it also
incorporates the corresponding deoxynucleotide, thereby replacing RNA by DNA.
5. After the RNA primers have been
replaced with DNA, there will be nicks between the 3'-OH end of the replaced
DNA segment and the 5'-phosphate of the adjacent DNA. The nick is sealed by
E. coli DNA ligase, which joins the 3'-OH to the 5'-phosphate using NAD+
as a cofactor.
6. If there are any uncorrected
misincorporations still left in the daughter duplexes, they are corrected by
mismatch repair systems. These are enzymes that remove DNA strands containing
non-complementary bases, and fill in the correct ones by repair synthesis. The
correction is not random; it is strongly biased towards retaining the template
strand. This is important so that the original genetic information is
preserved, and mutations are avoided. The bias is mediated by a methylation
system that modifies adenines within a target sequence. For
example, the Dam methylase acts on A within the GATC sequence. After completion
of DNA synthesis, there is a time lag before the nascent strand is methylated,
and the unmethylated strand is the one targeted by the repair system.
7. The catenated product duplexes are unlinked
by the action of topoisomerase IV prior to segregation.
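The strand bias described in point 6 can be sketched in Python. This is a deliberately simplified model under stated assumptions: the methylated template is always treated as correct, the two strands are given aligned position by position (template 3' to 5', nascent 5' to 3'), and single bases are corrected rather than a stretch of the strand being excised and resynthesized. The function name `repair_mismatches` is hypothetical.

```python
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}

def repair_mismatches(methylated_template, unmethylated_nascent):
    """Correct the nascent strand wherever it fails to pair with the template.

    Returns the repaired nascent strand and the positions that were changed.
    The template strand is never modified.
    """
    if len(methylated_template) != len(unmethylated_nascent):
        raise ValueError("strands must be aligned and of equal length")
    repaired = list(unmethylated_nascent)
    fixed_positions = []
    for i, (t_base, n_base) in enumerate(zip(methylated_template, unmethylated_nascent)):
        if PAIRING[t_base] != n_base:
            repaired[i] = PAIRING[t_base]  # restore the complementary base
            fixed_positions.append(i)
    return "".join(repaired), fixed_positions

# Template CTAGGT pairs with GATCCA; the nascent strand below has an A
# misincorporated opposite the G at position 4.
print(repair_mismatches("CTAGGT", "GATCAA"))
```

Because only the nascent strand is edited, the sequence of the template is what survives, which is the point of the methylation-directed bias.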
Minimizing errors during replication
As noted, there are multiple steps at which mistakes are corrected. This
is expected of an evolutionarily optimized process directed at the faithful duplication
and propagation of genetic information.
The base selection step (by complementarity) ensures that
misincorporation is limited to one every 10,000 to 100,000 bp (10^-4 to
10^-5). The proofreading activity improves accuracy by an additional
factor of 100 to 1,000. Hence the error rate drops to 10^-6 to 10^-8.
Mismatch repair further improves accuracy by a factor of 100 to 1,000. Thus, the
error rate comes down to one every 10^8 to 10^11 bases
incorporated. It therefore takes on the order of 100 to 10,000
replication events before a genome acquires a point mutation.
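How the three fidelity steps combine can be checked with a short Python sketch. It uses only the ranges quoted above and assumes the steps act independently, so that their improvement factors simply multiply. The variable names are illustrative, and `Fraction` is used to keep the powers of ten exact.

```python
from fractions import Fraction

# (worst case, best case) error rate per base after base selection alone
selection_error = (Fraction(1, 10**4), Fraction(1, 10**5))
# (smallest, largest) improvement factor contributed by each later step
proofreading_gain = (100, 1000)
mismatch_repair_gain = (100, 1000)

after_proofreading = tuple(
    error / gain for error, gain in zip(selection_error, proofreading_gain)
)
after_mismatch_repair = tuple(
    error / gain for error, gain in zip(after_proofreading, mismatch_repair_gain)
)

for label, rates in [
    ("base selection", selection_error),
    ("plus proofreading", after_proofreading),
    ("plus mismatch repair", after_mismatch_repair),
]:
    worst, best = rates
    print(f"{label}: one error per {worst.denominator:,} to {best.denominator:,} bases")
```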
Coordinating replication with cell division
Cells have developed mechanisms to coordinate DNA replication times with
cell division times. In higher cells the controls are quite elaborate, and
constitute the cell cycle: G1-S-G2-M. Each stage of the cell cycle is monitored
to ensure correct execution before proceeding to the next stage. The
surveillance mechanisms are called checkpoints. Bacteria also have a built-in
cell cycle clock that times replication events and cell division events and
keeps them in tune with each other. The logic of this control is much simpler in
bacteria than in higher systems. In E. coli, when cell division times are
altered according to nutrient status of the medium, the intervals between
firings of the replication origin (initiation events) are also modulated. The
pattern of this modulation will be discussed in class.
For an animated presentation of DNA
replication, go to the following websites (you may have
to paste the URL into your browser):
Initiation of replication: http://www.contexo.info/DNA_Basics/replication%20move.htm
Leading and lagging strand synthesis: http://www.youtube.com/watch?v=teV62zrm2P0&NR=1
http://www.andrew.cmu.edu/user/berget/Education/TechTeach/replication/RepOver.html