Article, 15 March 2001, free access

Crystal structure of HIV-1 reverse transcriptase in complex with a polypurine tract RNA:DNA

Stefan G. Sarafianos (1), Kalyan Das (1), Chris Tantillo (1), Arthur D. Clark Jr (1), Jianping Ding (1), Jeannette M. Whitcomb (2), Paul L. Boyer (3), Stephen H. Hughes (3) and Edward Arnold (1,*)

(1) Center for Advanced Biotechnology and Medicine (CABM) and Rutgers University Chemistry Department, 679 Hoes Lane, Piscataway, NJ, 08854-5638 USA
(2) ViroLogic, Inc., 270 E. Grand Avenue, S. San Francisco, CA, 94080 USA
(3) HIV Drug Resistance Program, NCI-Frederick Cancer Research and Development Center, PO Box B, Frederick, MD, 21702-1201 USA
(*) Corresponding author. E-mail: [email protected]

The EMBO Journal (2001) 20: 1449-1461, https://doi.org/10.1093/emboj/20.6.1449

We have determined the 3.0 Å resolution structure of wild-type HIV-1 reverse transcriptase in complex with an RNA:DNA oligonucleotide whose sequence includes a purine-rich segment from the HIV-1 genome called the polypurine tract (PPT). The PPT is resistant to ribonuclease H (RNase H) cleavage and is used as a primer for second DNA strand synthesis. The ‘RNase H primer grip’, consisting of amino acids that interact with the DNA primer strand, may contribute to RNase H catalysis and cleavage specificity. Cleavage specificity is also controlled by the width of the minor groove and the trajectory of the RNA:DNA, both of which are sequence dependent. An unusual ‘unzipping’ of 7 bp occurs in the adenine stretch of the PPT: an unpaired base on the template strand takes the base pairing out of register and then, following two offset base pairs, an unpaired base on the primer strand re-establishes the normal register. The structural aberration extends to the RNase H active site and may play a role in the resistance of PPT to RNase H cleavage.

Introduction

HIV-1 reverse transcriptase (RT) is a multifunctional enzyme that is responsible for copying the single-stranded viral RNA genome into double-stranded DNA (Telesnitsky and Goff, 1997).
RT contains a DNA polymerase that can copy either an RNA or DNA template and a ribonuclease H (RNase H) activity that cleaves the RNA strand in RNA:DNA hybrids. In addition to degrading the RNA genome after it has been copied to DNA, the RNase H cleavages define the ends of the double-stranded genome that are the substrates for integration into the host genome. In vivo studies demonstrate that inactivation of RNase H results in non-infectious virus particles (Tanese and Goff, 1988; Schatz et al., 1989). HIV-1 RT is a heterodimer consisting of p66 and p51 subunits. Both subunits are derived from a gag-pol polyprotein, which is cleaved by the viral protease. The two subunits have a common N-terminus; p51 lacks the C-terminal RNase H domain present in p66. Crystal structures of HIV-1 RT complexes with DNA:DNA have been determined in the presence (Huang et al., 1998) and absence (Jacobo-Molina et al., 1993; Ding et al., 1998) of a bound dNTP. While both subunits contain fingers, palm, thumb and connection subdomains, the arrangement of the subdomains in the two subunits is very different (Kohlstaedt et al., 1992; Jacobo-Molina et al., 1993). The crystal structure of the binary complex of HIV-1 RT and a DNA:DNA substrate showed that the DNA:DNA substrate is bent by ∼40° (Jacobo-Molina et al., 1993; Ding et al., 1998). Near the polymerase active site the duplex adopts A-form geometry; near the RNase H active site the duplex adopts B-form geometry (Ding et al., 1997). The structure of the isolated RNase H domain of HIV-1 RT was solved (Davies et al., 1991) before the structure of the intact HIV-1 RT was known. The HIV-1 RNase H domain has a structure that is very similar to the RNase HI of Escherichia coli (Katayanagi et al., 1990; Yang et al., 1990) and of Thermus thermophilus (Ishikawa et al., 1993), but none of the RNases H has been cocrystallized with RNA:DNA, their natural substrate, and there are no published structures of HIV-1 RT in complex with an RNA:DNA duplex. 
Polypurine tract in retroviral replication

Viral DNA synthesis (see Figure 1) is initiated from a cellular tRNA base paired to the genome of HIV-1 at the primer binding site (PBS). As the minus (−) DNA strand is synthesized, the RNA strand is digested by RNase H. The PBS is near the 5′ end of the genome. Degradation of the RNA by RNase H allows DNA synthesis to be transferred to the 3′ end of the RNA. After strand transfer, (−) strand synthesis can continue, accompanied by RNase H degradation of the RNA genome, but the degradation is not complete. The purine-rich polypurine tract (PPT) is resistant to RNase H cleavage and serves as the primer for plus (+) strand synthesis (reviewed in Telesnitsky and Goff, 1997). The PPT sequence is just 5′ of U3 (Figure 1). Removal of the PPT primer by RNase H defines the left end of the upstream long terminal repeat (LTR) (Figure 1E), which, together with the downstream LTR, is the substrate for the viral integrase enzyme that inserts the linear viral DNA in the host genome.

Unlike many retroviruses that have only one PPT sequence, HIV-1 has a second copy of the PPT (central PPT) located near the center of the genome (Charneau et al., 1992). Mutations replacing purines by pyrimidines in the HIV-1 and TY1 central PPTs, which do not modify amino acid sequence, slow down viral growth (Charneau et al., 1992; Hungnes et al., 1992; Heyman et al., 1995), suggesting that the central PPT is important, but not necessary for replication of retroviruses. While the sequence of multiple copies of PPTs is identical, it is likely that the relative importance of a PPT is determined by the neighboring sequences that flank different copies of PPT.

Figure 1. Process of reverse transcription of the HIV-1 genome. (A) Minus strand DNA synthesis (DNA strand in red) is initiated using a cellular tRNA annealed to the PBS. The RNA strand of the RNA:DNA duplex is degraded by RNase H of HIV-1 RT. (B) First strand transfer allows annealing of the newly formed DNA to the 3′ end of the viral genome. Transfer is mediated by identical repeated (R) sequences. (C) Minus strand DNA synthesis resumes, accompanied by RNase H digestion of all template RNA except PPT. (D) PPT is used as a primer for second strand DNA synthesis. (E) RNase H removes the tRNA and the PPT. In HIV-1, a single RNA nucleotide (from tRNA) is left by RNase H at the RNA/DNA PBS junction. (F) During second strand transfer (not shown) the newly formed PBS DNA (second strand) anneals to the PBS DNA from the first strand. Completion of second strand synthesis results in a linear DNA duplex with LTRs at both ends.

There are at least three requirements for the end of the viral genome to be synthesized correctly. First, the PPT RNA must be resistant to cleavage by RNase H during (−) strand DNA synthesis (Figure 1C). Secondly, RNase H cleavage must occur precisely at the end of the PPT to generate the correct primer for the proper initiation of (+) strand DNA synthesis (Figure 1C). Thirdly, after the PPT primer has been used to initiate DNA synthesis, it must be precisely removed from the end of the viral DNA (Figure 1E). These three requirements have been extensively studied using biochemical methods in the HIV-1, avian sarcoma leukosis virus (ASLV) and Moloney murine leukemia virus (MuLV) systems (reviewed in Telesnitsky and Goff, 1997).

The PPT is relatively resistant to RNase H degradation in vitro, although cleavages within the PPT can occur (Champoux et al., 1984; Resnick et al., 1984; Rattray and Champoux, 1989; Wöhrl and Moelling, 1990; Fuentes et al., 1995; Gao et al., 1998, 1999).
There is evidence to suggest that both RT and the PPT are important for determining the specificity of cleavage and for controlling plus strand priming: (i) mutations at the primer grip or thumb subdomain of RT dramatically affect the ability of the enzyme to cleave specifically at the 3′ end of the PPT (Ghosh et al., 1997; Palaniappan et al., 1997; Powell et al., 1997, 1999; Gao et al., 1998); (ii) the isolated RNase H domain of MuLV RT exhibits different cleavage specificity of PPT than the intact MuLV RT (Zhan and Crouch, 1997); (iii) with the exception of a few changes at the G-rich 3′ end of PPT, most single base mutations of the PPT sequence can be tolerated without altering the specificity of cleavage and plus strand priming by HIV-1 RT (Huber et al., 1989; Rattray and Champoux, 1989; Luo et al., 1990; Wöhrl and Moelling, 1990; Pullen et al., 1993; Powell and Levin, 1996); and finally, (iv) the NMR structure of an 8 bp RNA:DNA oligonucleotide containing the last four residues of PPT (with a mutation of one G to A) and the first 4 bp of U3, shows that the width and shape of the major groove is unusual (Fedoroff et al., 1997). Despite the considerable body of experimental data, however, the molecular details of PPT recognition remain unclear both in terms of the generation of the primer and its removal.

Mechanism and specificity of RNase H cleavage

NMR structural studies (Fedoroff et al., 1993; Lane et al., 1993) have suggested that RNA:DNA duplexes are neither A- nor B-form structures in solution. This led to the hypothesis that RNase H distinguishes DNA:RNA and RNA:RNA duplexes by recognizing differences in the width of the minor groove, and suggested that a minor groove width of ∼9–10 Å should be optimal for efficient recognition by RNase H (Fedoroff et al., 1993, 1997; Salazar et al., 1994; Zhu et al., 1995; Horton and Finzel, 1996; Han et al., 1997; Bachelin et al., 1998; Szyperski et al., 1999).
In some cases, however, RNase H can cleave single-stranded RNA adjacent to the RNA:DNA duplex region, albeit with low efficiency (Gao et al., 1997; Lima and Crooke, 1997). Furthermore, under certain conditions the RNases H of both MuLV RT and HIV-1 RT can cleave an RNA:RNA substrate (Ben-Artzi et al., 1992; Blain and Goff, 1993; Smith and Roth, 1993; Hostomsky et al., 1994; Götte et al., 1995). Finally, HIV-1 RT can cleave chimeric hybrid duplexes (RNA-DNA annealed to DNA) at the RNA/DNA junction where the minor grooves tend to be very narrow [as small as 4.5–5.5 Å (Szyperski et al., 1999)] and bent (Salazar et al., 1994; Szyperski et al., 1999). These data suggest that the specificity of RNase H cleavage does not depend solely on the width of the minor groove (Szyperski et al., 1999).

In an effort to study the interactions of HIV-1 RT with RNA:DNA, to discern the molecular details of PPT recognition and to understand better the mechanism and specificity of RNase H cleavage, we determined the 3.0 Å crystal structure of HIV-1 RT in complex with a PPT-containing RNA:DNA substrate and the Fab fragment of a monoclonal antibody. The PPT-containing RNA:DNA oligonucleotide (r31:d29) was bound to HIV-1 RT with the 3′ end of DNA at the polymerase active site and the middle part of the PPT (defined as a stretch of rAs) near the RNase H active site (Figure 2). This binding mode was designed to investigate the inefficient cleavage of the PPT by HIV RNase H. This complex has extensive protein–nucleic acid interactions, and the nucleic acid has unusual structural features that provide a basis for understanding why the PPT is resistant to degradation by HIV-1 RNase H.

Figure 2. Top: HIV genome sequence at the PPT (underlined) and U3 region. The minus strand synthesis initiation site is marked with an asterisk; +1 is the first nucleotide of U3. Bottom: sequence of the RNA:DNA oligonucleotide in our RT–RNA:DNA complex.
Results and discussion

Overall structure of the HIV-1 RT–RNA:DNA complex

The overall conformation of the protein in the HIV-1 RT–RNA:DNA complex (Figure 3) is similar to that in the HIV-1 RT–DNA:DNA complex. There are extensive interactions between the nucleic acid and amino acids of all subdomains of the p66 subunit; there are also interactions between the nucleic acid and the fingers and connection subdomains of the p51 subunit (Figures 3 and 4). The nucleic acid binding cleft is ∼60 Å in length, extending from the polymerase active site to the RNase H active site. The distance in nucleotides between the polymerase and RNase H catalytic sites in this structure is 18 bp, which agrees with past biochemical studies with HIV-1 RT (Schatz et al., 1990; Wöhrl and Moelling, 1990; Gopalakrishnan et al., 1992; Ghosh et al., 1995; Götte et al., 1995; Gao et al., 1998, 1999). In HIV-1 RT complexes containing double-stranded DNA template-primers, the distance between the polymerase and RNase H active sites is 17 bp (Jacobo-Molina et al., 1993; Huang et al., 1998). This nucleic acid-dependent difference in the distance between the polymerase and RNase H active sites agrees with previous biochemical studies (Götte et al., 1998).

The final model of RNA:DNA template-primer includes 45 nucleotides (23 template, 22 primer) encompassing more than two full helical turns of RNA:DNA. The major groove is exposed to the solvent, as might be expected for a non-specific DNA binding protein. The electron density of the nucleic acid is stronger in regions where there are extensive interactions with the enzyme, including the polymerase and RNase H active sites. Simulated annealing omit maps for nucleotides in key locations are shown in Figure 5.

Figure 3. Stereo view of a ribbon representation of the structure of HIV-1 RT in complex with the polypurine RNA:DNA. The fingers, palm, thumb, connection and RNase H subdomains of p66 are colored blue, red, green, yellow and orange, respectively. The p51 subunit is colored gray. The RNA template and DNA primer strands are shown in magenta and blue, respectively.

Figure 4. The sequence and numbering scheme of the RNA:DNA PPT and the interactions between the nucleic acid and amino acid residues of HIV-1 RT (≤3.8 Å). The RNA (orange) and DNA (cyan) strands are designated Tem and Pri, respectively. The nucleotide site positions are labeled with ascending numbers from the polymerase domain toward the RNase H domain. Amino acids of the p51 subunit are designated by an asterisk following the residue number; all others are in p66. RNase H nucleotide site positions are designated positive (+1 to +4) for positions 3′ to, and negative (−1 to −9) for positions 5′ to, the scissile phosphate, where the 3′ and 5′ orientations are for the RNA strand. Hydrogen bonds are shown in red dashed lines and other types of interaction are shown in solid black lines. 2′-OH groups of RNA and phosphate groups are shown in red and gray spheres. Weakly paired (distance ≥3.6 Å), mismatched and unpaired bases are shown filled with stripes, spheres and empty, respectively. Residues Gly359 and Ala360 of the RNase H primer grip interact with the nucleic acid through their main-chain atoms. Arg284 was modeled as Ala because of weak density for the side chain. N474 interacts with Pri15-Thy through a water molecule (not shown).

Figure 5. Simulated annealing (Fo − Fc) omit electron density maps contoured at the 2σ level at the polymerase active site (1) (omitting nucleic acid) and of the unpaired residue of template (2) (omitting unpaired residue Tem-15-Ade).
RT has similar interactions with the DNA primer strand in RT–RNA:DNA and RT–DNA:DNA structures

While the contacts between RT and the DNA primer strands are very similar in the RT complexes with RNA:DNA and DNA:DNA template-primers, the contacts with the RNA template (Figures 3 and 4) are different from those with the DNA template. However, there are relatively modest changes in the protein structure. As seen in the RT–DNA:DNA structure, many of the RT contacts with the nucleic acid in the RT–RNA:DNA complex involve the sugar–phosphate backbone, consistent with the fact that RT can copy a wide variety of different templates. RT has numerous interactions with 2′-OH groups of the RNA template in the RT–RNA:DNA complex. Such interactions (indicated in magenta in Figure 6) include residues 280 and 284 of helix I of the p66 thumb, residues of the template grip including 89 and 91 of the p66 palm and residues of the RNase H domain (Figures 4 and 6). The more extensive contacts between RT and RNA:DNA versus DNA:DNA may account for the increased polymerization activity and processivity of the enzyme with RNA templates.

Figure 6. Molecular surface representation of HIV-1 RT showing the nucleic acid binding cleft and the RNase H primer grip. Residues colored in cyan or magenta are amino acids within 3.8 Å of the 2′-OH of RNA template nucleotides (magenta) or any other part of nucleic acid (cyan). The RNA template is shown as red ribbon and the DNA primer in blue ribbon. Minor groove widths proximal to the thumb area or at the RNase H active site are indicated (∼10 and ∼8 Å, respectively). The trajectory and minor groove width of a hypothetical RNA strand that can be cleaved efficiently by RNase H are shown in red. The RNase H primer grip region is shown in ball and stick representation in the figure inset.
There is also an increased involvement of the p51 subunit in binding the RNA:DNA relative to the DNA:DNA template-primer. Although residues Lys395 and Glu396 of the p51 subunit interact with both the DNA:DNA and RNA:DNA duplexes (with Pri10-Ade and Pri11-Ade, Figure 4), residues Lys22 and Lys390 of p51 interact only with the RNA:DNA duplex (with the phosphates of Tem4-Gua and Tem16-Ade).

Nucleic acid geometry

The nucleic acid geometry was analyzed using CURVES (Lavery and Sklenar, 1988) and SCHNAP (Lu et al., 1997). Both programs yielded similar results. Despite the difference in the nature of the duplex (RNA:DNA versus DNA:DNA), the sequence (PPT versus PBS) and the length (31:29 versus 19:18), the overall conformations of these two nucleic acids are remarkably similar. Both have a bend of ∼40° with the helical curvature occurring smoothly over bp 5–9 from the polymerase active site (Jacobo-Molina et al., 1993; Ding et al., 1998) (Figure 7). This bend is a hallmark of nucleic acids bound to a variety of polymerases and is associated with a transition from A- to B-form geometry.

Figure 7. Stereo view of structures of the nucleic acid template-primers in the RT–DNA:DNA (Ding et al., 1998) and RT–RNA:DNA complexes. The 19mer DNA and 31mer RNA templates are shown in yellow and magenta, respectively. The 18mer DNA and 29mer DNA primers are cyan and blue. Region I contains the 4 bp near the polymerase active site. Region II consists of the next 4 bp at the bend of the nucleic acid. The next 5 bp compose region III, followed by region IV that contains residues of the ‘unzipped’ part of PPT.

The helical parameters of the RNA:DNA duplex bound to HIV-1 RT were compared with values for canonical A- and B-form DNA:DNA and A-form RNA:RNA duplexes and with other RNA:DNA hybrids (Table I). The values obtained for the RNA:DNA in complex with RT were averaged within four separate regions of the nucleic acid.
Region I contains the 4 bp near the polymerase active site. Region II consists of the next 4 bp at the bend of the nucleic acid. Region III includes the next 5 bp, followed by region IV, which contains residues of the ‘unzipped’ portion of the PPT (see below). The parameters that define nucleic acid conformation (Table I) suggest that none of the four regions has canonical A- or B-type geometry. However, the geometry of region I is significantly closer to that of A-form than the other regions. Regions II–IV have a conformation closer to an intermediate between A- and B-form, consistent with the suggestion of Arnott et al. (1986) that RNA:DNA duplexes adopt ‘H-form’ geometry. The H-form conformation is characterized by values for the inclination of the base pairs with respect to the helical axis, the dislocation of the base pairs from the helix axis (Xdisp) and the helical rise, which are intermediate between those of canonical A- and B-form helices (Table I). NMR studies of RNA:DNA duplexes in solution showed that unliganded RNA:DNA duplexes adopt H-form geometry (Fedoroff et al., 1993; Lane et al., 1993). The width of the minor groove of the RNA:DNA (PPT) varies between canonical A- and B-forms (Figure 8), however, and it is closer to B-form in the stretch of rA:dTs (region IV, average minor groove width 7 Å). This is similar to the minor groove width in B-DNA (6 Å) determined by X-ray diffraction of DNA fibers (Chandrasekaran et al., 1989), and markedly lower than the average minor groove width in the RNA:DNA duplexes whose structures have been solved by NMR [∼9 Å, (Arnott et al., 1986; Fedoroff et al., 1993; Lane et al., 1993; Han et al., 1997)] or crystallography (8.7–10.5 Å) (Horton and Finzel, 1996), as shown in Figure 8. 
A purine-rich sequence with only two consecutive As (RNA:DNA 5′-GAAGAAGAA:CTTCTTCTT) had a considerably wider minor groove (9.4–10.1 Å) (Xiong and Sundaralingam, 1998), suggesting that the narrowing of the minor groove in 5′-oligo(rA):3′-oligo(dT) tracts of RNA:DNA follows the same rules as in DNA:DNA [5′-oligo(dA):3′-oligo(dT) tracts], i.e. maximal narrowing of the minor groove requires at least four consecutive As. The narrow minor groove we report here for the rA:dT tract of the PPT is remarkably close to the minor groove narrowing predicted by Dickerson and coworkers (Han et al., 1997).

Figure 8. Variation in minor groove width of the RNA:DNA template-primer. The four regions of nucleic acid are defined in the legend of Figure 7. The minor groove width values for canonical A- and B-type DNA are 11 and 6 Å, respectively.

Table I. Nucleic acid parameters

| | Xdisp (Å) | Inclination (°) | Rise (Å) | Twist (ω) | Slide (Å) |
|---|---|---|---|---|---|
| Region I (bp 1–4) | −2.8 | 0.1 | 3.1 | 31.5 | −0.1 |
| Region II (bp 5–8) | −2.5 | −1.5 | 3.2 | 33.8 | −0.1 |
| Region III (bp 9–13) | −2.3 | −3.7 | 3.3 | 33.2 | −0.2 |
| Region IV (‘unzipped’ 14–19) | −2.0 | −3.9 | 3.2 | 32.2 | 0.1 |
| RNA:DNA (a) | −4.7 | 12.1 | 2.9 | 30.4 | −1.6 |
| RNA:DNA (b) | −3.7 | 5.5 | 3.1 | 32.0 | −1.9 |
| RNA:DNA (c) | −3.3 | 13.9 | 2.9 | 33.0 | −1.2 |
| RNA:DNA (d) | | 19 | 2.5 | 33.1 | |
| A-RNA (e) | | 13 | 2.8 | 32.7 | −1.1 |
| A-DNA (f) | −4.1 | 12 | 2.9 | 31.1 | −1.6 |
| B-DNA (f) | −0.14 | −6 | 3.4 | 36.0 | 0.4 |
| r(R10)d(Y10) (g) | −3.3 | 6 | 2.9 | 33.7 | |
| r(R10)r(Y10) (h) | −5.2 | 8.1 | 2.6 | 31 | |

(a) Conn et al. (1999).
(b) Fedoroff et al. (1997).
(c) Horton and Finzel (1996).
(d) Wang et al. (1982).
(e) Leonard et al. (1994).
(f) Dickerson (1992).
(g) Gyi et al. (1996).
(h) Gyi et al. (1998).

The geometry of the RNA:DNA hybrid in complex with RT is considerably less uniform than the geometry of unliganded RNA:DNA duplexes, either in solution or in crystals.
The variation in the width of the minor groove of the protein-bound RNA:DNA complex (up to 4.5 Å) (Figure 8) is considerably larger than that of free RNA:DNA hybrids (typically 0.5–2 Å), suggesting that the structure of the RNA:DNA duplex is significantly affected by the contacts with the enzyme. The effects of RT are manifested in several ways: (i) the RNA:DNA template-primer has a bend near the RT polymerase active site that is similar to the bend in complexes of RT and a DNA:DNA duplex; (ii) both the RNA:DNA (PPT) and DNA:DNA (PBS) duplexes have similar A-form geometry near the RT polymerase active site, where the majority of the protein–nucleic acid contacts occur; and (iii) there are structural irregularities (discussed below in more detail) in the RNA:DNA (PPT) that may be either caused or stabilized by specific contacts with the enzyme (Figure 4).

Role of A-tracts in the structure of PPT

A-tracts [in this case, stretches of four or more consecutive 5′-oligo(dA):3′-oligo(dT) (dA:dT) or 5′-oligo(rA):3′-oligo(dT) (rA:dT)] have long been known to display several unusual features: they are straight (Han et al., 1997) and have a narrow minor groove and a large propeller twist, presumably because there are only two rather than three hydrogen bonds between the bases. A-tracts (dA:dT) are highly resistant to reconstitution around nucleosome cores (Rhodes, 1979) and interact poorly with TATA binding protein, which requires an ∼80° bend at the binding site (Kim et al., 1993). Integration host factor (IHF), a small protein that specifically recognizes an A-tract (dA:dT) by binding the phosphates of the narrow minor groove, appears to recognize an A-rich (dA:dT) DNA duplex by structure rather than by base-specific contacts (Rice et al., 1996). It is possible that special structural features of A-tracts (rA:dT) are recognized by HIV RT and affect the specificity of RNase H cleavage.
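The working definition of an A-tract used above (a run of four or more consecutive As on the purine strand) can be sketched as a short sequence scan. The following Python is only an illustration under stated assumptions: the function name `find_a_tracts`, the `min_len` parameter and the second test sequence are hypothetical and are not taken from this study; the first sequence is the purine strand of the GAAGAAGAA hybrid cited above, which has no run of four As.

```python
import re


def find_a_tracts(seq, min_len=4):
    """Return (start, end, length) for every run of at least min_len
    consecutive A residues in seq (0-based start, end exclusive).

    min_len=4 mirrors the 'four or more consecutive As' criterion quoted
    in the text; it is a parameter here, not a measured threshold.
    """
    pattern = re.compile(r"A{%d,}" % min_len)
    return [
        (m.start(), m.end(), m.end() - m.start())
        for m in pattern.finditer(seq.upper())
    ]


# Purine strand of the hybrid with only two consecutive As: no A-tract.
print(find_a_tracts("GAAGAAGAA"))

# Hypothetical illustrative sequence (not the sequence shown in Figure 2).
print(find_a_tracts("CCAAAAGAAAAGG"))
```

Such a scan only flags candidate tracts by sequence; whether a flagged run actually has a narrowed minor groove is a structural question that the scan does not address.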
While A-tracts in DNA (dA:dT) are unbent, bends are commonplace at junctions between A-tracts (dA:dT) and a dG:dC base pair (Dickerson et al., 1996). A dG:dC/dA:dT junction has been reported to be a flexible hinge, capable of adopting either a straight or a bent conformation under the influence of local forces (Dickerson et al., 1994). While this information is based primarily on crystallographic studies with DNA:DNA oligonucleotides, the same factors, i.e. large propeller twist and narrow minor groove, exist in an rA:dT RNA:DNA duplex (Table I). Therefore, the PPT sequence may have a natural propensity for (i) bending at the rG:dC/rA:dT junction and for (ii) ‘stiffness’ in the stretches that contain multiple As. These two properties, combined with the extensive protein–nucleic acid interactions in this region (Figure 4; Table II), may contribute to an unusual structural deformation that we term ‘unzipping of the PPT’, which is centered at the rG:dC/rA:dT junction and is discussed below. Furthermore, they are likely to affect the trajectory of the template-primer and its positioning at the RNase H active site.

Table II. Template