Journal of Virology, 2005, Issue 14
A Downstream Polyadenylation Element in Human Papi
     Department of Medical Biochemistry and Microbiology, Uppsala University, BMC, Box 582, 751 23 Uppsala, Sweden

    Dublin Institute of Technology, Kevin Street, Dublin 8, Ireland

    ABSTRACT

    Production of human papillomavirus type 16 (HPV-16) virus particles is totally dependent on the differentiation-dependent induction of viral L1 and L2 late gene expression. The early polyadenylation signal in HPV-16 plays a major role in the switch from the early to the late, productive stage of the viral life cycle. Here, we show that the L2 coding region of HPV-16 contains RNA elements that are necessary for polyadenylation at the early polyadenylation signal. Consecutive mutations in six GGG motifs located 174 nucleotides downstream of the polyadenylation signal resulted in a gradual decrease in polyadenylation at the early polyadenylation signal. This caused read-through into the late region, followed by production of the late mRNAs encoding L1 and L2. Binding of hnRNP H to the various triple-G mutants correlated with functional activity of the HPV-16 early polyadenylation signal. In addition, the polyadenylation factor CStF-64 was also found to interact specifically with the region in L2 located 174 nucleotides downstream of the early polyadenylation signal. Staining of cervix epithelium with anti-hnRNP H-specific antiserum revealed high expression levels of hnRNP H in the lower layers of cervical epithelium and a loss of hnRNP H production in the superficial layers, supporting a model in which a differentiation-dependent down regulation of hnRNP H causes a decrease in HPV-16 early polyadenylation and an induction of late gene expression.

    INTRODUCTION

    The human papillomaviruses (HPVs) are small DNA tumor viruses (20). To date, more than 100 different types have been identified (12). Some of these HPV types, termed high-risk types, are associated with development of cancer of the uterine cervix, one of the most common cancers in women worldwide (35, 50). HPV type 16 (HPV-16) is the most common high-risk type (51). The HPV genome can be divided into an early region and a late region, followed by the proximal early (pAE) and the distal late (pAL) polyadenylation signals, respectively (Fig. 1A) (4). A polyadenylation signal consists of an AAUAAA sequence located 10 to 30 nucleotides (nt) upstream of the cleavage site and a degenerate GU-rich sequence element about 30 nucleotides downstream of the cleavage site (45). Some polyadenylation signals are followed by G-rich downstream elements (2, 3). The weak binding of cleavage/polyadenylation specificity factor (CPSF) to the AAUAAA motif is enhanced by CStF binding to the downstream GU-rich element (45). Recently, it was shown that U-rich upstream polyadenylation elements interact with hFip-1, an integral part of CPSF (21). In the early stage of the viral life cycle, all viral transcripts are polyadenylated at the pAE, whereas both the pAE and the pAL are used in the late, productive stage of the infection (26). In response to differentiation of the HPV-infected cell, the use of the pAE is down regulated, resulting in read-through into the late region and polyadenylation of the late transcripts at the pAL. Efficient polyadenylation at the pAE is an absolute requirement for early gene expression and for inhibition of late gene expression at an early stage in the viral life cycle. The pAE has a key regulatory role in the switch from the early to the late productive stage of the viral life cycle, marked by the induction of late gene expression. It is therefore of interest to determine how polyadenylation at the pAE is regulated. 
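
    The spacing rule described above (an AAUAAA hexamer 10 to 30 nt upstream of the cleavage site) can be illustrated with a minimal sketch. The function name, the 0-based coordinates, and the choice to count the gap from the last nucleotide of the hexamer are assumptions made for illustration only; the sketch does not score the GU-rich or G-rich downstream elements.

```python
import re

HEXAMER = "AAUAAA"

def candidate_cleavage_windows(rna, min_gap=10, max_gap=30):
    """List (hexamer_start, window_start, window_end) for every AAUAAA.

    Coordinates are 0-based and the window is half-open. The gap is counted
    from the last nucleotide of the hexamer (an assumption of this sketch),
    and the window is clipped to the end of the sequence.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA or RNA spelling
    hits = []
    # A zero-width lookahead keeps overlapping hexamers from being skipped.
    for match in re.finditer("(?=%s)" % HEXAMER, rna):
        end = match.start() + len(HEXAMER)
        window_start = min(end + min_gap, len(rna))
        window_end = min(end + max_gap + 1, len(rna))
        hits.append((match.start(), window_start, window_end))
    return hits
```

    Called on a sequence of interest, the function returns one tuple per hexamer; whether a given hexamer is actually used as a polyadenylation signal still has to be determined experimentally.
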
The HPV-16 early untranslated region (UTR) contains a U-rich upstream sequence element that interacts with the newly identified polyadenylation factor hFip-1 (21), suggesting that hFip-1 is required for recognition of the HPV-16 pAE (48). Bovine papillomavirus type 1 (BPV-1) appears to contain downstream elements in the L2 coding region that affect early polyadenylation (5). Polyadenylation at the pAE in HPV-31 is dependent on RNA elements extending 800 nucleotides into the L2 coding region (42). It was shown that these elements encoded multiple binding sites for the CStF-64 polyadenylation factor, suggesting that these RNA elements in L2 acted in concert to promote early polyadenylation in HPV-31 (43). In earlier studies, we found that the HPV-16 L2 coding region contains RNA elements that efficiently block expression of L2 from L2 cDNA expression plasmids transiently transfected into HeLa cells (9, 10, 30, 37). This primarily affected the L2 mRNA half-life (37) and translation (9). A mutant (Mut) L2 gene was generated in which the L2 RNA sequence, but not the L2 protein sequence, was altered (30). This mutant L2 gene produced high levels of L2 mRNA and L2 protein, indicating that inhibitory RNA elements in L2 had been inactivated (30). We speculated that the role of these elements in the viral life cycle was to regulate HPV-16 polyadenylation and/or splicing (30, 32-34). We previously identified inhibitory elements in the 5' end of the HPV-16 L1 coding region (10, 40) and later showed that they are exonic splicing silencers that inhibit HPV-16 late mRNA splicing (49). In the present investigation, we further studied the inhibitory elements in the HPV-16 L2 coding region. We used wild-type (wt) and mutant L2 genes in order to determine the effects of the L2 RNA elements on HPV-16 mRNA polyadenylation and splicing. Our results show that the HPV-16 L2 sequence encodes a polyadenylation element, which encompasses multiple GGG motifs that interact with hnRNP H. We speculate that this element and hnRNP H regulate polyadenylation at the HPV-16 early polyadenylation signal.

    MATERIALS AND METHODS

    Plasmid constructions. The sequences of oligonucleotides used in generating all constructs are presented in Table 1 in the order they appear in Materials and Methods.

    (i) pBEX constructs. The pBELDP construct has been described previously (48). pBEX differs from pBEL (49) in that it contains a unique XbaI site in the E5 open reading frame and no BssHII site in the E4 open reading frame. pBEX was generated in three cloning steps. (i) PCR amplification was performed on the complete HPV-16 genome using sense primer E5startSalI/XbaI with antisense primer L1stop(XhoI). The product was cloned into the previously described pCL0806 (10) plasmid using SalI/XhoI. (ii) L1 sequences from nucleotides 514 to 1514 containing a mutation of a cryptic polyadenylation site (30) were inserted with BamHI and XhoI, replacing wild-type L1 sequences. (iii) Primers 757sense and E5startSalI/XbaI were used to amplify the 5' end of the HPV-16 genome. This fragment was inserted into pL0806 (10), producing pBEX. pBEXM was generated by insertion of an ApaI-BamHI fragment from pBELM (49) into pBEX. pBEX-L2M1-299 and pBEXM-L2M1-299 were generated by PCR mutagenesis. The first fragment was PCR amplified from pBEX using primers E5startSalI/XbaI and helL2mut(as) and the second using primers helL2ms and HPV16#2shorta.s. on pC16L2mut (30). The two fragments were fused by PCR mutagenesis and inserted with XbaI-ApaI into pBEX. pBEX-L2 M and pBEXM-L2 M were generated by PCR mutagenesis. The first fragment was PCR amplified from pC16L2mut using primers L2m(SalI)nt93sense and L2antifusion and the second from pBEX or pBEXM using primers L2sensefusion and L1stop(XhoI). The two fragments were fused by PCR mutagenesis and inserted into pBEX-L2M1-299 or pBEXM-L2M1-299 using ApaI and BamHI. pBEX-PPT and pBEXM-PPT were generated by PCR mutagenesis. The first fragment was amplified from pBEX using primers E5startSal/XbaI and PPTmutA.S. and the second using primers PPTmutsense and L2wt293a.s. The two fragments were fused by PCR mutagenesis and cloned with XbaI-ApaI into pBEX-L2M1-299 and pBEXM-L2M1-299. 
To generate pBEX-G1-6 and pBEXM-G1-6, a PCR mutagenesis series was performed. The first fragment was amplified from pBEX using primers E5startSalI/XbaI and 1stGGGmutA.S. and the second using primers 1stGGGmutsense and L2wt293a.s. The two fragments were fused by PCR mutagenesis to produce the XbaG1Apa fragment. XbaG1Apa was used as a template in PCR mutagenesis to generate XbaG1-2Apa (primer pairs E5startSalI/XbaI-2ndGGGmutA.S. and 2ndGGGmutsense-L2wt293a.s.). XbaG1-2Apa was further used as a template in PCR mutagenesis to generate XbaG1-3Apa (primer pairs E5startSalI/XbaI-3rdGGGmutA.S. and 3rdGGGmutsense-L2wt293a.s.). XbaG1-3Apa was further used as a template in PCR mutagenesis to generate XbaG1-4Apa (primer pairs E5startSalI/XbaI-4thGGGmutA.S. and 4thGGGmutsense-L2wt293a.s.). XbaG1-4Apa was further used as a template in PCR mutagenesis to generate XbaG1-5Apa (primer pairs E5startSalI/XbaI-5thGGGmutA.S. and 5thGGGmutsense-L2wt293a.s.). XbaG1-5Apa was further used as a template in PCR mutagenesis to generate XbaG1-6Apa (primer pairs E5startSalI/XbaI-6thGGGmutA.S. and 6thGGGmutsense-L2wt293a.s.). XbaG1-6Apa was cloned with XbaI/ApaI into pBEX-L2 M and pBEXM-L2 M. pBEX-del30 and pBEXM-del30 were generated by PCR mutagenesis. The first fragment was amplified from pBEX using primers E5startSal/XbaI and del30A.S. and the second using primers del30S and L2wt293a.s. The two fragments were fused by PCR mutagenesis and cloned with XbaI-ApaI into pBEX-L2M1-299 and pBEXM-L2M1-299. pBEX-0-frag and pBEXM-0-frag were generated by PCR mutagenesis. The first fragment was amplified from pBEX using primers E5startSal/XbaI and 0-fragA.S. and the second using primers 0-fragS and L2wt293a.s. The two fragments were fused by PCR mutagenesis and cloned with XbaI-ApaI into pBEX-L2M1-299 and pBEXM-L2M1-299. pBEXDP plasmids with L2 alterations were constructed in the same way as their corresponding pBEX constructs but using the previously described pBELDP construct (48).

    (ii) pCATL2 hybrids. The pCATL2 construct was generated in two steps. (i) HPV-16 genomic sequences were PCR amplified using primers E5stopsense and L2stopA.S., followed by insertion into pL0806 using ApaI and BamHI. (ii) The CAT sequence was amplified using primers CATstartSalI and CAT(BamHI)as and inserted upstream of the fragment described above using SalI and BamHI. pCATL2 M was generated in two steps. (i) pBEX-L2 M was PCR amplified using primers E5stopsense and HPV16#2shorta.s., and the fragment was cloned with BamHI-ApaI into pCATL2, generating pCATL2M1-299. (ii) L2mut sequences were inserted into pCATL2M1-299 using ApaI and XhoI, generating pCATL2 M. p#1-411 was generated by insertion of an ApaI-XhoI fragment from pC16L2mut4-10 (30) into pCATL2. p#1-299 was generated by insertion of an ApaI-XhoI fragment from pC16L2mut into pCATL2. p#1-203 was generated by PCR amplification of HPV-16 sequences from pBEX using primers E5stopsense and L2wt203a.s., followed by insertion into pCATL2 M using BamHI and EagI.

    (iii) p#1-299 deletions. The p3del, p30del, p51del, p81del, p111del, p168del, and p224del constructs were generated by PCR amplification of pBEX using primer E5stopsense in combination with L2(ApaI)nt3as, L2(ApaI)nt30as, L2(ApaI)nt51as, L2(ApaI)nt81as, L2(ApaI)nt111as, L2(ApaI)nt168as, and L2(ApaI)nt224as, respectively. The fragments were inserted into pCATL2 M using BamHI and ApaI. The p7, p9, p8, p10, p12, p14, and p11 plasmids were generated by PCR amplifying HPV-16 sequences from pBEX using the primer pairs L2(BssHII)nt39sense and L2(ApaI)nt111as, L2(BssHII)nt93sense and L2(ApaI)nt168as, L2(BssHII)nt147sense and L2(ApaI)nt224as, L2(BssHII)nt201sense and L2wt293a.s., L2(BssHII)nt39sense and L2(ApaI)nt224as, L2(BssHII)nt147sense and L2wt293a.s., and L2(BssHII)nt93sense and L2(ApaI)nt224as, respectively. The fragments were inserted into pCATL2 M using BssHII and ApaI.

    (iv) pCATL2 point mutations. pG1m, pG1-2m, pG1-3m, pG1-4m, pG1-5m, and pG1-6m were generated by PCR amplification of fragments XbaG1Apa, XbaG1-2Apa, XbaG1-3Apa, XbaG1-4Apa, XbaG1-5Apa, and XbaG1-6Apa, respectively, using primers E5stopsense and L2wt293a.s. The fragments were inserted into pCATL2 M using BamHI and ApaI. pPPTm was generated by PCR amplification of pBEX-PPT using primers E5stopsense and L2wt293a.s. The fragment was inserted into pCATL2 M using BamHI and ApaI. pdel30m and p0-FRAGm were generated by PCR mutagenesis. The primary fragment was PCR amplified from pBEX using primer pairs E5stopsense and del30A.S. and E5stopsense and 0-fragA.S. The second fragment was amplified from pBEX using primer pairs del30S and L2wt293a.s. and 0-fragS and L2wt293a.s. The two fragments were fused by PCR mutagenesis and inserted into pCATL2 M using BamHI/ApaI. pCATDP was generated by PCR amplification of pBELDP (48) using primers E5stopsense and L2wt293a.s. The fragment was inserted into pCATL2 M using BamHI and ApaI.

    (v) T7 constructs. The pT71-111wt construct was generated by PCR amplification of pBEX using primers L2startBssHII (earlier described [30]) and L2(ApaI)nt111as. The fragment was inserted into the previously described pUC19T7 (49) using BssHII and ApaI. pT793-168wt, pT7147-224wt, and pT7203-299wt were generated by insertion into pUC19T7 of fragments described in "p#1-299 deletions" above using BssHII and ApaI. pT71-111Mut, pT793-168Mut, pT7147-224Mut, and pT7203-299Mut were generated by insertion into pUC19T7 of PCR fragments amplified from pC16L2 M using primer pairs L2mutstart (earlier described [30]) and L2m(ApaI)nt111as, L2m(SalI)nt93sense and L2m(ApaI)nt168as, L2m(SalI)nt147sense and L2m(ApaI)nt224as, and L2m(SalI)nt201sense and HPV-16#2shorta.s, respectively. pT730G/U was generated by annealing oligonucleotides 30G/UA.S and 30G/US, followed by insertion into pUC19T7 using BamHI and HindIII. pT7PPT, pT7G2-6, and pT7G2-4 were generated by insertion into pUC19T7 of BssHII- and ApaI-digested PCR fragments amplified from pBEX-PPT, XbaG1-6Apa, and XbaG1-4Apa, respectively, using primers L2(BssHII)nt147sense and L2(ApaI)nt224as.

    (vi) pTHCStF-64 RBD. pTHCStF-64 RBD was generated by PCR amplification of the RNA binding domains of CStF-64 from glutathione S-transferase (GST)-CStF-64 (generously provided by C. Milcarek) using primers CstF64hisNheIs and CstF64hisHindIIIas. The fragment was subcloned with NheI and HindIII into the pTrcHis plasmid (Invitrogen).

    pC16EIAV, used as an internal control in transient transfections, has been described previously (10). Subclonings were performed with a pCR 4-TOPO cloning kit (Invitrogen). Plasmids were purified with a QIAGEN Plasmid Maxi kit.

    Transfections, RNA extractions, and Northern blotting. Transfections were performed in HeLa cells according to the Fugene 6 method (Roche Molecular Biochemicals). All experiments were performed at least three times, and sample variation within a transfection series was less than 20%. Briefly, 1 μg of DNA was mixed with 3 μl of Fugene 6 and subsequently added in 200-μl aliquots consisting of DNA, Fugene 6, and Dulbecco's modified Eagle's medium to 60-mm plates containing subconfluent HeLa cells. The transfected cells were harvested at 24 h posttransfection, and total RNA was prepared according to the RNeasy Mini protocol (QIAGEN). Northern blot analysis was performed by the separation of 10 μg of total RNA on a 1% agarose gel containing 2.2 M formaldehyde, followed by transfer to a nitrocellulose filter and hybridization, as described previously (10, 46). Random priming of the DNA probe was performed using a Decaprime kit (Ambion) according to the manufacturer's instructions. All Northern blots were quantified in a Bio-Rad phosphorimager (GS-250). The cytomegalovirus (CMV) and L1 probes have been described previously (10, 49). The template for the chloramphenicol acetyltransferase (CAT) probe was generated by SalI and BamHI digestion of pCATL2.

    Preparation of cell extracts and recombinant protein, UV cross-linking and immunoprecipitation. HeLa cell nuclear and cytoplasmic S100 extracts were prepared according to the method of Dignam et al. (13). His-tagged CStF-64RBD encompassing the RNA binding domain was expressed from the plasmid pTHCStF-64 RBD. His-CStF-64RBD was purified using a HiTrap chelating column according to the manufacturer's instructions (Pharmacia). In vitro synthesis of radiolabeled and unlabeled RNAs was performed on linearized plasmid DNA using T7 RNA polymerase in the presence of [32P]UTP, as previously described (38). The radiolabeled RNAs were purified by centrifugation through Sephadex G-50 columns (Pharmacia). UV cross-linking and immunoprecipitation were performed as previously described (38). For immunoprecipitations, we used rabbit anti-hnRNP H peptide antiserum NH114 (generously provided by J. Nikolic and D. Black) or a rabbit anti-cytokeratin 5/6 antiserum (Roche Molecular Biochemicals).

    CAT ELISA. The levels of CAT protein were quantified using a CAT antigen capture enzyme-linked immunosorbent assay (ELISA) (Roche Molecular Biochemicals). All CAT quantitations were normalized to the protein concentration of the cell extract, as determined by the Bradford method.
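
    The normalization described above amounts to dividing each CAT reading by the protein concentration of the same extract. A minimal sketch follows; the function name and the units (pg of CAT per ml, mg of protein per ml) are assumptions chosen for illustration and are not taken from the assay protocol.

```python
def normalize_cat(cat_pg_per_ml, protein_mg_per_ml):
    """Return CAT per unit of total protein (pg CAT per mg protein).

    Units are hypothetical; any consistent pair of units works the same way.
    """
    if protein_mg_per_ml <= 0:
        raise ValueError("protein concentration must be positive")
    return cat_pg_per_ml / protein_mg_per_ml
```

    Normalized values from different extracts can then be compared directly, since differences in cell number or lysis efficiency are divided out.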

    Immunohistochemical detection of hnRNP H. Five-micrometer-thick uterine cervical tissue sections, obtained from the National Maternity Hospital in Dublin, Ireland, were cut from formalin-fixed paraffin-embedded tissue blocks and melted onto silane-coated slides at 65°C for 4 h. The sections were then dewaxed in xylene and rehydrated through ethanol to water. Antigen retrieval was performed by immersing tissue sections in 500 ml 0.1 M citrate buffer (pH 6.0) and heating them in a pressure cooker for 12 min. After 20 min in hot citrate buffer, the sections were washed in distilled water and treated with 3% hydrogen peroxide in methanol for 10 min. The sections were rinsed in phosphate-buffered saline (PBS) before being stained using the Vectastain Elite ABC kit (Vector). Normal (rabbit) serum was diluted 1:66 with PBS. Approximately 100 μl of goat polyclonal antibody specific for hnRNP H/H' (clone N-16; Santa Cruz Biotechnologies) at a 1:400 dilution was applied and incubated at room temperature for 1 h. The sections were washed with PBS for 5 min, and the biotinylated secondary antibody (1:200 dilution with PBS) was applied for 15 min. After being washed with PBS, the sections were covered with avidin-biotin complex reagent (1:25 dilution with PBS) for 15 min. Peroxidase labeling was visualized using 0.03% hydrogen peroxide and 0.06% 2,4-diaminobenzidine (Sigma-Aldrich). Mayer's hematoxylin was used as a counterstain, and the sections were then dehydrated in ethanol through xylene and coverslipped using a resinous mountant.

    RESULTS

    Two RNA elements in the HPV-16 L2 coding region are required for early polyadenylation and late mRNA splicing. We have previously shown that the HPV-16 L2 coding region contains RNA elements that inhibit expression of L2 from cDNA expression plasmids (9, 10, 30, 37). We have also shown that mutations that altered the L2 RNA sequence, but not the L2 protein sequence, inactivated the RNA elements and caused high expression of L2 protein from cDNA expression plasmids (30). In order to determine the functions of these elements in the context of the HPV-16 genome, we used a modified version (see Materials and Methods) of the previously described pBEL plasmid (49) named pBEX (Fig. 1B), in which the HPV-16 genome encoding the early region from E1 to the pAE and the entire late region had been cloned downstream of the strong CMV promoter. This plasmid produced high levels of early mRNAs, primarily mRNA 880/3358 (Fig. 1C, CMV probe), but late mRNAs could not be detected (Fig. 1C, L1 probe). We replaced the wild-type L2 gene in pBEX with the mutant L2 gene, producing plasmid pBEX-L2 M (Fig. 1B). Care was taken not to inactivate the 3' splice site at position 5637 when the wt L2 sequence was replaced by the mutant L2 sequence. The 3'-most 100 nt of the mutant L2 sequence were replaced by wt L2 sequences encoding the wt branch point and polypyrimidine tract of the 3' splice site at 5637 (Fig. 1B). pBEX and pBEX-L2 M were transfected into HeLa cells, and total RNA was analyzed by Northern blotting using the L1 probe. Replacing the L2 wild-type gene in pBEX with the L2 mutant sequence, as in pBEX-L2 M, resulted in a significant increase in late mRNA levels (Fig. 1C, L1 probe), strongly suggesting that sequences required for utilization of the early polyadenylation site (pAE) had been destroyed. These results were confirmed by hybridization to the CMV probe that detects the CMV leader sequence that is present on all mRNAs produced by pBEX-derived constructs (Fig. 1C, CMV probe). The mRNA levels produced from the internal-control plasmid pC16EIAV (10) were similar in all transfections (Fig. 1C). Only low levels of the short RNA species polyadenylated at the pAE (early mRNA) were seen with pBEX-L2 M, whereas high levels of early mRNAs were produced by pBEX, as described above (Fig. 1C, CMV probe).

    Next, we replaced the L2 wild-type sequence with the mutant L2 gene in pBEXM, resulting in pBEXM-L2 M (Fig. 1B). The pBEXM plasmid has been shown to produce late mRNAs as a result of the mutational inactivation of exonic splicing silencers (ESS) in the L1 coding region (49). Late mRNAs produced from this plasmid are therefore primarily spliced, representing L1 mRNAs (Fig. 1B and C, L1 probe). In contrast, pBEXM-L2 M, which contains L1 and L2 mutant genes, produced primarily unspliced late mRNAs, representing L2 mRNAs (Fig. 1C, L1 probe). Similarly to pBEX-L2 M, this plasmid did not produce detectable levels of early mRNAs (Fig. 1C, CMV probe). In contrast, pBEXM produced high levels of early mRNAs, as expected (Fig. 1C, CMV probe). Taken together, these results demonstrated that the mutations in L2 caused a switch from early to late gene expression, whereas mutations in L1 induced the expression of spliced late mRNAs without significantly affecting early mRNA production. All experiments were performed on more than three independent occasions, and the results have been reproduced in multiple experiments. A comparison of results obtained with pBEX-L2 M with those previously described for pBELDP (48), in which the pAE sequence AAUAAA had been inactivated by site-directed mutagenesis, revealed two differences. First, approximately 40% of the late mRNAs produced from pBELDP were spliced, whereas the majority of the late mRNAs produced from pBEX-L2 M were unspliced (Fig. 1C, L1 probe). Second, mutational inactivation of the pAE in pBELDP activated upstream cryptic polyadenylation sites, as previously described (48), whereas the mutations in L2, which also inhibited polyadenylation at the pAE, did not (Fig. 1C, CMV probe). Taken together, these results demonstrated that elements necessary for both HPV-16 early polyadenylation and late mRNA splicing are located in the L2 coding region and that these RNA elements had been inactivated by the mutations in the L2 sequence.

    The polyadenylation element in HPV-16 L2 acts independently of splicing. To further study polyadenylation at the HPV-16 pAE, we inserted the entire HPV-16 early 3' UTR, pAE, L2 coding region (nucleotides 4072 to 5656), and pAL downstream of the CAT reporter gene under transcriptional control of the CMV promoter, resulting in the pCATL2 plasmid (Fig. 2A). We also constructed pCATL2 M, in which the L2 wild-type gene was replaced by the L2 mutant sequence (Fig. 2A). These plasmids were transfected into HeLa cells, and total RNA was analyzed by Northern blotting. The complete CAT sequence was used as a probe, which allowed simultaneous detection of the short CAT mRNA polyadenylated at the pAE and the long CAT-L2 mRNA polyadenylated at the pAL (Fig. 2A). As seen in Fig. 2B, pCATL2 produced primarily the short CAT mRNAs, demonstrating that the pAE was fully functional and that polyadenylation was efficient in pCATL2. On the other hand, pCATL2 M produced primarily the longer CAT-L2 mRNA polyadenylated at the downstream pAL (Fig. 2B), indicating that the pAE was not recognized. The internal-control plasmid pC16EIAV (10) produced similar levels of equine infectious anemia virus (EIAV) gag mRNA (Fig. 2B). These results confirmed that sequences in the L2 coding region are absolutely required for the function of the pAE. This effect could not be attributed to effects on mRNA half-life by the mutations in L2, since the two plasmids produced similar levels of CAT protein as quantified in a CAT capture ELISA (data not shown). The functional inactivation of the polyadenylation elements in L2 did not affect polyadenylation at the pAL. We concluded that sequences in the HPV-16 L2 coding region are necessary for utilization of the pAE and that these sequences function in the absence of splicing.

    Mapping of downstream polyadenylation elements to the 5' part of the L2 coding region. To map the polyadenylation element in L2, we made hybrids between wild-type and mutant L2 sequences in the pCATL2 plasmid (Fig. 3A). These plasmids were transfected into HeLa cells, and total RNA was analyzed by Northern blotting with the complete CAT sequence used as a probe. By introducing more and more mutant L2 sequences from the 3' end of the gene, as in p#1-411, p#1-299, and p#1-203, the polyadenylation at the pAE became gradually less efficient, with polyadenylation efficiencies of 75%, 65%, and 60%, respectively (Fig. 3B). The mRNA levels produced from the internal-control plasmid pC16EIAV (10) were similar in all transfections (Fig. 3B). As expected, only the long CAT-L2 mRNA was detected when the pAE was mutationally inactivated, as in pCATDP (Fig. 3B). The results revealed that sequences needed for full polyadenylation at the pAE extended even beyond the first 411 nt of the L2 coding region. Less efficient polyadenylation at the pAE was obtained with both p#1-299 and p#1-203, suggesting that multiple polyadenylation elements were present in the 5' end of L2.
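
    The text does not spell out how the polyadenylation efficiencies were calculated. One plausible reading, given that the CAT probe detects both the short mRNA polyadenylated at the pAE and the long read-through mRNA polyadenylated at the pAL, is the share of the short band in the combined signal. The sketch below encodes that assumption; the function name and the band intensities are hypothetical.

```python
def polyadenylation_efficiency(short_signal, long_signal):
    """Percentage of transcripts polyadenylated at the pAE.

    Assumes efficiency = short / (short + long) * 100, where the inputs are
    background-corrected band intensities from the same lane. This formula
    is an interpretation, not a definition given in the text.
    """
    total = short_signal + long_signal
    if total <= 0:
        raise ValueError("combined signal must be positive")
    return 100.0 * short_signal / total
```

    Under this reading, a lane with three times as much short as long mRNA would score 75%.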

    Multiple elements in the HPV-16 L2 coding region are necessary for polyadenylation at the pAE. To map the polyadenylation elements in L2 further, consecutive 3'-end deletions were introduced into the L2 wild-type sequence in the p#1-299 plasmid (Fig. 4A). A low level of polyadenylation was seen with plasmids containing at least 51 nt of L2 (6% in p#51del) (Fig. 4B). Inclusion of L2 sequences down to positions 111 and 163 raised the polyadenylation efficiency to 23% and 38%, respectively (Fig. 4B). However, only plasmids containing 200 nt or more of L2 produced a majority (>59%) of short CAT mRNAs polyadenylated at the pAE (Fig. 4B). These results further supported the conclusion that L2 contains multiple polyadenylation elements and/or a complex RNA secondary structure. However, a major polyadenylation element appeared to be located within the first 203 nucleotides. The presence of these sequences on the mRNA resulted in a polyadenylation efficiency of 60% at the pAE (Fig. 4B).

    Internal deletions identified a polyadenylation enhancer between nucleotide positions 147 and 224 in HPV-16 L2. In order to study the multiple elements in L2 further, short L2 fragments were inserted into the pCATL2 M plasmid using the two unique cloning sites BssHII (position 39 in L2) and ApaI (position 299 in L2) (Fig. 5A). Plasmids were transfected into HeLa cells, and total RNA was analyzed by Northern blotting. Mutations in the first 39 nucleotides of L2 caused a 2.7-fold reduction in polyadenylation at the pAE, as seen when p#224del (62%) and p12 (23%) are compared (Fig. 4B and 5B). Either an element in the immediate 5' end of L2 was destroyed or a secondary structure was altered by the mutations in the first 39 nt of L2. Since all inserted fragments were tested in the background of the mutations in the first 39 nt of L2, the effects of all L2 sequences on polyadenylation in this experiment were weaker than those shown in Fig. 2, 3, and 4. The L2 sequences in p7 (6%), p9 (1.1%), and p10 (1.3%) did not significantly enhance polyadenylation at the pAE (Fig. 5B). The constructs enhancing polyadenylation were p8 (22%), p12 (23%), p14 (39%), and p11 (27%) (Fig. 5B). These results mapped a major polyadenylation element to the 147-to-224 region in L2, which was present in all constructs that displayed significant levels of early polyadenylation. These results are in agreement with those shown in Fig. 4. A comparison between p8 and p14 revealed that sequences between nucleotide positions 224 and 299 in L2 increased polyadenylation from 22% in p8 (147 to 224) to 39% in p14 (147 to 299). Polyadenylation at the pAE in the presence of the first 51 nucleotides of L2 (Fig. 4) and polyadenylation at the pAE in the absence of this sequence (Fig. 5) support the idea that multiple nonoverlapping polyadenylation elements are present in the L2 coding region. 
Interestingly, the 147-to-224 sequence contains five triple-G motifs, whereas the 224-to-299 region encodes two additional G triplets. We concluded that sequences between 147 and 224 displayed the strongest stimulatory effect on early HPV-16 polyadenylation and that this effect was further enhanced by sequences located between positions 224 and 299. The result suggested that multiple triple-G motifs may be part of the polyadenylation element.
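
    The 2.7-fold reduction quoted above for p#224del (62%) versus p12 (23%) is a simple ratio of the two efficiencies. A short sketch of that arithmetic, with a hypothetical function name:

```python
def fold_reduction(reference_pct, test_pct):
    """Ratio of a reference efficiency to a lower test efficiency."""
    if test_pct <= 0:
        raise ValueError("test efficiency must be positive")
    return reference_pct / test_pct

# p#224del (62%) compared with p12 (23%), values taken from the text
print(round(fold_reduction(62, 23), 1))  # prints 2.7
```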

    Mutations in the triple-G motifs in HPV-16 L2 reduced polyadenylation at the early polyadenylation signal. Having established that the sequence between nucleotides 147 and 299 in the 5' end of the L2 coding region promoted polyadenylation at the HPV-16 pAE, we analyzed this sequence further. Visual inspection revealed that an unusually high number of triple-G motifs were located in this region (Fig. 6A). It had been shown previously that G-rich elements downstream of polyadenylation signals, most notably triple-G motifs binding members of the hnRNP H family, may enhance polyadenylation (2, 3). We therefore investigated if triple-G motifs located in the 5' end of HPV-16 L2 were involved in polyadenylation. In an effort to study the effect of the triple-G motifs on polyadenylation, we consecutively mutated the triple-G motifs by replacing the middle G with a T in plasmid p#1-299. This generated plasmids pG1m, pG1-2m, pG1-3m, pG1-4m, pG1-5m, and pG1-6m (Fig. 6A). Plasmid p#1-299 was used because it displayed an intermediate polyadenylation phenotype and any changes in polyadenylation would be detectable. The plasmids were transfected into HeLa cells, and total RNA was analyzed by Northern blotting. As can be seen in Fig. 6B, the polyadenylation efficiency gradually decreased as more G triplets were mutated. Results obtained with G-to-T mutations in all G triplets showed that the polyadenylation efficiency had been lowered from 65% in p#1-299 to 22% in pG1-6m (Fig. 6B). We concluded that the presence of all six triple-G motifs downstream of the pAE was needed for efficient polyadenylation at the HPV-16 pAE and that each of these motifs contributed to the polyadenylation efficiency.
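
    The cumulative mutagenesis scheme described above (replacing the middle G of each successive GGG with a T) can be sketched in a few lines. The function name is hypothetical and the input used in the usage note is a toy sequence, not the HPV-16 L2 sequence; the sketch also treats every non-overlapping GGG from the 5' end as a motif, which may not match the six motifs the authors selected.

```python
import re

def mutate_ggg_motifs(seq, n_motifs):
    """Change GGG to GTG in the first n_motifs non-overlapping triplets.

    Triplets are counted from the 5' end of seq; later triplets are left
    intact. The input is expected to be an upper-case DNA sequence.
    """
    seen = 0

    def replace(match):
        nonlocal seen
        seen += 1
        return "GTG" if seen <= n_motifs else match.group(0)

    return re.sub("GGG", replace, seq)

def mutation_series(seq, total_motifs):
    """Return the cumulative mutants: first motif, first two, and so on."""
    return [mutate_ggg_motifs(seq, k) for k in range(1, total_motifs + 1)]
```

    For example, `mutation_series("ACGGGTTGGGA", 2)` gives one sequence with the first triplet mutated and one with both mutated, mirroring the logic of the pG1m through pG1-6m series.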

    The GGG motifs constitute the major polyadenylation element at the HPV-16 early polyadenylation signal. Downstream polyadenylation elements are often U rich (45). To investigate if a hexa-U stretch at position 4396 in HPV-16 (Fig. 7A) affected polyadenylation at the HPV-16 pAE, deletions and mutations were introduced in p#1-299 (Fig. 7A). We mutated the middle two Us in the hexa-U stretch, resulting in plasmid pPPTm (Fig. 7A). Mutations in the hexa-U region did not affect polyadenylation (Fig. 7B). Deletion of the entire GU-rich sequence between nucleotides 147 and 180, as in pdel30m (Fig. 7A), affected polyadenylation to a small extent, causing a decrease in polyadenylation from 63% to 58% (Fig. 7B). This was most likely a result of the simultaneous deletion of two triple-G motifs. Since most downstream polyadenylation elements are located within 30 nucleotides of the cleavage and polyadenylation sites of a polyadenylation signal, we wished to investigate if the sequence between the pAE and the region encoding the triple-G motifs affected polyadenylation at the pAE. We therefore deleted sequences between the pAE and the first triple-G motif region, generating plasmid p0-FRAGm (Fig. 7A). Transfection experiments revealed that this deletion did not affect the polyadenylation efficiency (Fig. 7B), confirming the importance of the downstream region containing the triple-G motifs.

    The triple-G motifs are required for polyadenylation at the HPV-16 early polyadenylation signal in a splicing environment. To evaluate the effects of the mutations in the 5' end of the L2 gene in the context of the HPV-16 genome, we introduced an L2 sequence in which the first 299 nucleotides of L2 were derived from the L2 mutant gene into the pBEX and pBEXM constructs, generating pBEX-L2M1-299 and pBEXM-L2M1-299, respectively (Fig. 8A). These plasmids were transfected into HeLa cells, and total RNA was analyzed by Northern blotting with the L1 or CMV probe (Fig. 8A). Both plasmids produced elevated levels of late mRNAs (Fig. 8B, L1 probe). Similar to the results with pBEX-derived plasmids pBEX-L2M and pBEXM-L2M containing the entire L2 mutant (Fig. 1C), the late mRNAs produced from pBEX-L2M1-299 were primarily unspliced L2 mRNAs (Fig. 8B, L1 probe). However, in contrast to pBEXM-L2M, pBEXM-L2M1-299 produced primarily spliced L1 mRNAs (Fig. 8B), demonstrating that a sequence required for late mRNA splicing is located in the 3' end of the L2 region and that this sequence is not mutated in pBEX-L2M1-299.

    We also introduced the triple-G mutations into pBEX and pBEXM, generating pBEX-G6 and pBEXM-G6, respectively (Fig. 8A). Both plasmids produced elevated late mRNA levels (Fig. 8B). In contrast, mutations in the hexa-U stretch (Fig. 8A), as in pBEX-PPT and pBEXM-PPT, did not induce production of late mRNA (Fig. 8B). None of the mutations affected late mRNA levels if the pAE was deleted (Fig. 8C), demonstrating that the mutations affected polyadenylation at the pAE. Early mRNAs (880/3358) were produced by all plasmids (Fig. 8B, CMV probe), indicating that efficient polyadenylation at the pAE occurred. However, the relatively small effect on early mRNA polyadenylation by the mutations in L2 resulted in a significant increase in the production of late mRNAs. These results suggested that polyadenylation at the pAE was under the control of multiple elements and that polyadenylation was not totally dependent on intact triple-G motifs in L2. Deletion of the 30-nucleotide region with the T-rich sequence, as in pBEX-del30 and pBEXM-del30, resulted in an increase in late mRNA levels (Fig. 8B). This result was most likely an effect of the simultaneous deletion of two triple-G motifs. In contrast, deletion of the region upstream of the GGG-containing sequence, as in pBEX-0frag and pBEXM-0frag, did not significantly induce late gene expression (Fig. 8B). The different mutations were also introduced into the pBEX and pBEXM plasmids in the background of a deleted pAE, generating pBEXDP and pBEXMDP (Fig. 8C). No difference was seen in the levels of late transcripts between wild-type L2 sequences and L2 mutants (Fig. 8C), indicating that the mutations in L2 affected the pAE and not RNA instability elements. In conclusion, mutating the first 299 nucleotides of L2 or the triple-G motifs in the 5' end of L2 resulted in decreased polyadenylation at the pAE and activated late gene expression in a splicing context.

    Identification of cellular factors that interact specifically with polyadenylation elements in HPV-16 L2. In order to identify cellular factors that bind to the polyadenylation elements in the 5' end of the L2 coding sequence, wild-type and mutant L2 sequences of different lengths were cloned downstream of the T7 promoter in the pUC19T7 plasmid (49). The wild-type and mutant sequences representing nucleotides 1 to 111, 93 to 168, 147 to 224, and 202 to 293 (Fig. 9A) of L2 were analyzed by UV cross-linking. Wild-type L2 sequences encompassing nucleotides 147 to 224 cross-linked to a 55-kDa factor particularly well (Fig. 9B). The wild-type 93-to-168 L2 sequence cross-linked to a factor of the same size, but with lower efficiency (Fig. 9B). In both cases, the 55-kDa factor cross-linked less efficiently to the corresponding L2 mutant sequence (Fig. 9B). This factor was also present in the cytoplasm in S100 extracts (Fig. 9B). Many cellular factors are present in both nuclear and cytoplasmic fractions as a result of functional roles in both compartments. Weak cross-linking to a protein of the same size was seen with the 1-to-111 and 202-to-293 RNAs, but preference for the wt sequence over the mutant sequence was not observed (Fig. 9B). An 85-kDa protein cross-linked with sequence specificity to the 1-to-111, 93-to-168, and 202-to-293 RNAs but did not appear to bind 147-to-224 RNA (Fig. 9B). Binding of this factor, therefore, did not totally correlate with the function of the L2 sequence in polyadenylation. We concluded that the functionally active 147-to-224 L2 RNA UV cross-linked specifically to a 55-kDa factor, suggesting that this protein was involved in polyadenylation. A 55-kDa factor also interacted specifically with the 93-to-168 fragment (Fig. 9B). If this was the same factor that cross-linked to the 147-to-224 L2 RNA, it suggested that multiple binding sites for the 55-kDa protein were present in L2. 
A competition assay revealed that the two sequences competed in a concentration-dependent manner for the 55-kDa factor (Fig. 9C), demonstrating that they interacted with the same 55-kDa protein. Interestingly, the number of triple-G motifs in the two different L2 probes correlated with the cross-linking efficiency of the 55-kDa protein, suggesting that the 55-kDa protein recognized triple-G motifs.

    CStF-64 binds specifically to the HPV-16 L2 RNA sequence but is not the major L2 RNA binding protein. It has previously been shown that the HPV-31 L2 sequence contains multiple weak binding sites for CStF-64 and that CStF-64 binds to HPV-31 L2 RNA (42, 43). We therefore performed a number of UV cross-linking experiments to investigate if the CStF-64 protein also interacted with the HPV-16 L2 sequence and if CStF-64 was the 55-kDa factor. Wild-type radiolabeled 147-to-224 RNA was cross-linked to nuclear extract in the absence or presence of cold RNA competitors encoding L2 wild-type 147 to 224, L2 mutant 147 to 224, and an optimal CStF-64 binding site (T7CStF-64) (39) (Fig. 10A). The wild-type 147-to-224 sequence competed to a higher degree than the corresponding mutant sequence, as expected (Fig. 10A). Cold T7CStF-64 RNA did not compete (Fig. 10A), indicating that the vast majority of the 55-kDa band contained factors other than CStF-64. To investigate the potential interaction between CStF-64 and HPV-16 L2 RNA further, radiolabeled T7CStF-64 RNA was UV cross-linked to nuclear extract in the presence of serially diluted cold T7CStF-64 RNA, 147-to-224 L2 wild-type RNA, or 147-to-224 L2 mutant RNA (Fig. 10B). In addition to the efficient competition observed with the T7CStF-64 RNA competitor, the wild-type 147-to-224 L2 RNA competed to a very small extent, whereas the L2 mutant 147 to 224 did not compete at all (Fig. 10B). These results further indicated that the major HPV-16 L2 RNA binding protein was distinct from CStF-64 but that CStF-64 still interacted weakly with HPV-16 L2 RNA in a sequence-specific manner. Finally, recombinant His-CStF64RBD was synthesized and UV cross-linked to T7CStF-64 RNA, L2 wild-type 147-to-224 RNA, and the corresponding 147-to-224 mutant sequence (Fig. 10C). As expected, CStF-64 cross-linked with high efficiency to the CStF-64 RNA (Fig. 10C). 
CStF-64 also interacted with L2 wild-type 147-to-224 RNA, although much less efficiently (Fig. 10C). This binding was more efficient than the interaction between CStF-64 and the L2 mutant 147-to-224 RNA (Fig. 10C). In contrast, a reversed binding strength was observed with the 55-kDa factor in nuclear extract (Fig. 10C). We concluded that CStF-64 binds weakly but specifically to the L2 147-to-224 wild-type RNA sequence. However, the major fraction of the 55-kDa protein seen cross-linking to HPV-16 L2 RNA is distinct from CStF-64.

    The 55-kDa protein binds to triple-G sequences in the HPV-16 L2 RNA. We next wished to determine where in the 147-to-224 L2 RNA sequence the 55-kDa factor binds. A 30-nucleotide L2 sequence (nt 149 to 178) containing a GU-rich region encoding two of the six triple-G motifs and the hexa-U stretch was inserted downstream of the T7 promoter in the pUC19T7 plasmid, generating pT730G/U (Fig. 11A). Radiolabeled 30G/U, 147-224Wt, and 147-224Mut RNAs were subjected to UV cross-linking to nuclear extract. As seen in Fig. 11B, the 30G/U RNA cross-linked efficiently to the 55-kDa protein. As expected, the 55-kDa protein cross-linked to 147-224Wt but not to 147-224Mut (Fig. 11B).

    We also mutated all five triple-G motifs or the three 3'-most triple Gs in 147-224Wt by replacing the middle G with T, generating pT7G2-6 and pT7G2-4, respectively (Fig. 11A). In addition, we mutated the middle two Us in the hexa-U stretch, producing plasmid pT7PPT (Fig. 11A). Radiolabeled RNAs from pT7PPT, pT7G2-6, pT7G2-4, pT7147-224Wt, and pT7147-224Mut were UV cross-linked to nuclear extract. The hexa-U (PPT) mutant cross-linked to the 55-kDa protein as efficiently as the 147-224Wt sequence (Fig. 11C). In contrast, mutations in all G triplets, as in G2-6, resulted in complete loss of binding of the 55-kDa factor (Fig. 11C). The G2-4 mutant displayed an intermediate phenotype (Fig. 11C), which suggested that more than one G triplet was recognized by the 55-kDa protein. We concluded that the 55-kDa protein binds to the G triplets in the 147-to-224 sequence of the HPV-16 L2 RNA coding region.

    hnRNP H binds to the GGG motifs in the HPV-16 L2 polyadenylation element. Known factors that bind poly(G) stretches include the related hnRNP H, H', and F proteins with molecular masses of 55, 54, and 52 kDa, respectively (19, 27, 28). We therefore investigated if a serum raised against recombinant hnRNP H could immunoprecipitate the 55-kDa HPV-16 L2 RNA binding protein. Radiolabeled RNA from pT730G/U was cross-linked to nuclear extract and subjected to immunoprecipitation with a polyclonal anti-hnRNP H antiserum. The anti-hnRNP H antiserum HN114 (see Materials and Methods) specifically immunoprecipitated the 55-kDa protein that UV cross-linked to HPV-16 L2 RNA (Fig. 11D), whereas the anti-cytokeratin 5/6 antibody did not (Fig. 11D). We concluded that the 55-kDa protein that binds to the G triplets in HPV-16 L2 RNA is hnRNP H.

    Cell differentiation-dependent expression of hnRNP H in cervical epithelium. The results presented above suggested a regulatory role for hnRNP H in the HPV-16 viral life cycle and predicted a cell differentiation-dependent expression pattern of hnRNP H in cervical epithelium. We therefore investigated the hnRNP H expression pattern in apparently normal cervical epithelium. Immunohistochemical detection of hnRNP H revealed that hnRNP H was indeed expressed in the lower layers and suprabasal layers in the cervical squamous epithelium but not in the superficial layers consisting of highly differentiated cells (Fig. 12). These results demonstrated an inverse correlation between cell differentiation and hnRNP H protein production and supported a role for hnRNP H in the induction of late gene expression in the HPV-16 life cycle.

    DISCUSSION

    We have previously shown that the coding regions of HPV-16 L2 and L1 contain RNA elements that have destabilizing effects on the RNA when these genes are expressed under the control of the CMV promoter (10, 30). Since L2 and L1 can be detected only in terminally differentiated epithelial cells infected with HPV, we proposed that these RNA elements regulate late gene expression in a differentiation-dependent manner in the viral life cycle (32-34). Early in infection, factors would bind to the elements and destabilize the RNA. In contrast, late in infection, these factors would not bind, and hence, the mRNAs would be stable and L1 and L2 proteins would be produced. Alternatively, we suggested that these elements bind cellular factors that cannot execute their functions when the viral late RNAs are not expressed in the context of the whole genome (32-34). Such factors could be polyadenylation and/or splicing factors. Indeed, we have since shown that the inhibitory element in the HPV-16 L1 coding region encodes an ESS that binds hnRNP A1 (49). In the present study, we provide data that demonstrate that the inhibitory region in the 5' end of HPV-16 L2 encodes elements that promote polyadenylation at the pAE and that they interact with hnRNP H. We also show that sequences in the HPV-16 L2 coding region are necessary for splicing of the late mRNAs. Therefore, intragenic splicing and polyadenylation elements may inhibit expression of cDNAs in mammalian cells. It is not unlikely that many cDNAs that have been "codon optimized" in order to increase protein production, including early HPV genes (8, 14, 25), contain various regulatory RNA elements that have been destroyed by the mutations.

    A phylogenetic comparison of the 5' ends of L2 in different members of the papillomavirus family showed that the pAE has the characteristics of a weak polyadenylation site. A weak polyadenylation site has a CPSF-interacting site variant other than AAUAAA and/or a suboptimal G/U-rich tract for CStF binding (45). When aligning HPV, BPV, and other mammalian PV sequences from the region downstream of the pAE site, GU-rich sequences at the reported optimal distance of 20 to 70 nucleotides downstream of the polyadenylation site were generally not found. However, according to the results presented here, the G-rich region encoding multiple triple-G motifs located more than 174 nucleotides downstream of the pAE is a major polyadenylation element in HPV-16. Interestingly, GU-rich regions with multiple poly(G) tracts were found in almost all mammalian papillomaviruses (data not shown). In a database search of different mammalian polyadenylation sites, 34% of the downstream regions were found to contain short tracts of G residues (2), suggesting that polyadenylation of cellular mRNAs may be regulated by downstream poly(G) motifs and hnRNP H.
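    The positional argument made above, that the triple-G motifs lie far outside the reported optimal window of 20 to 70 nucleotides downstream of the polyadenylation site, can be illustrated with a short sketch. The Python function below is an assumption-laden example, not the analysis used in this study: the function name, the default window, the minimum tract length of three, and the toy sequence are all hypothetical. It reports each G tract in a downstream region together with its distance from the polyadenylation site and whether that distance falls inside the window.

```python
import re


def g_tracts_downstream(downstream, min_len=3, window=(20, 70)):
    """List (distance, tract, in_window) for every run of at least
    min_len G residues in the sequence downstream of a poly(A) site.

    distance is 1-based: the first downstream nucleotide is position 1.
    """
    hits = []
    for m in re.finditer(r"G{%d,}" % min_len, downstream.upper()):
        distance = m.start() + 1
        in_window = window[0] <= distance <= window[1]
        hits.append((distance, m.group(), in_window))
    return hits


# Hypothetical toy region: one tract inside the window, one far beyond it.
toy = "A" * 24 + "GGG" + "T" * 150 + "GGGG" + "A" * 5
for distance, tract, in_window in g_tracts_downstream(toy):
    print(distance, tract, "within window" if in_window else "outside window")
```

    Applied to a set of aligned downstream regions, the same function could be used to count how many contain a G tract at all, which is the kind of tally behind the 34% figure cited from the database search (2).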

    The phylogenetically closest relative of HPV-16 is HPV-31. Terhune et al. (42, 43) studied the L2 region downstream of the pAE in HPV-31 and found that 800 nucleotides following the pAE site are required for full polyadenylation efficiency. Similarly, we found that polyadenylation at the HPV-16 pAE requires at least 400 nucleotides of downstream sequences derived from the L2 coding region. Therefore, transcription continues far into the L2 region before polyadenylation at the pAE occurs (5, 42, 43). If the RNA polymerase reaches the pAL, inhibitory sequences are located in the HPV-16 late 3' UTR (22-24, 40). Such sequences have been found in multiple papillomaviruses, including HPV-1 (36, 41, 46), HPV-31 (11), and BPV-1 (16), and may act by preventing untimely late gene expression. These elements may act in a differentiation-dependent manner (29). Additional evidence for continued transcription is the detection of spliced L1 transcript upon transfection of the pBEXM plasmid. Splicing from the E4 splice donor (SD) (nt 3652) to the L1 splice acceptor (SA) (nt 5637) is mutually exclusive with polyadenylation at the pAE (nt 4215). The mutational inactivation of ESS in the L1 coding region (49), more than 1.5 kb downstream of the pAE, increased the competitiveness of the L1 3' splice site toward the pAE, further demonstrating that transcription continues far into the late region (Fig. 1). It has recently been shown that transient complexes form and interact on coupled splice and polyadenylation substrates (47). The polyadenylation complex and the last 3' splice acceptor complex (Ac) fuse to form the mature Bc complex coupling polyadenylation and last-intron removal, well before splicing and polyadenylation products are detected. If such a complex formed between the E4 3' splice site and the pAE in HPV-16, it would compete with complexes forming between the 5' splice sites in the early region of HPV-16 and the late 3' splice site immediately upstream of L1. 
Combined, these results demonstrate that HPV early polyadenylation signals are parts of complex regulatory networks in which competition between polyadenylation signals and splice sites regulates HPV gene expression in a differentiation-dependent manner during the viral life cycle.

    Analysis of proteins binding to the HPV-31 L2 RNA revealed that it contains multiple weak binding sites for CStF-64 (43). The investigators speculated that binding of CStF-64 to these sites correlated with the polyadenylation efficiency at the pAE (43). They also showed that some subunits of the CStF complex are down regulated in keratinocytes in response to differentiation induced by suspension in semisolid medium (43). In agreement with these results, we find that CStF-64 binds preferentially to the HPV-16 L2 RNA sequence that is required for polyadenylation but not to the mutant L2 sequence that fails to support polyadenylation. However, we were unable to determine if binding of CStF-64 to the L2 RNA sequence is involved in polyadenylation of HPV-16. The change in relative levels of CStF has been proposed to affect polyadenylation site usage (43). One may speculate that CStF-64 and hnRNP H act cooperatively to promote polyadenylation at the pAE in HPV-16 and HPV-31. It would be interesting to determine if these two factors interact to promote polyadenylation.

    Normally, downstream polyadenylation elements are located within 20 to 70 nucleotides of the polyadenylation signal (45). In contrast, the triple-G motifs in HPV-16 are located 174 and more nucleotides downstream of the pAE. To investigate if the sequence between the pAE and the triple-G motifs is important in polyadenylation, we deleted the sequence between the HPV-16 pAE and L2 nucleotide position 147 (Fig. 7). This resulted in a polyadenylation phenotype similar to that of the full-length wild-type construct pCATL2 (Fig. 7), suggesting that the first 147 nucleotides downstream of the pAE are not important for polyadenylation. This region did not contain a classical U-rich downstream element. The triple-G motifs were found in a relatively short sequence between positions 147 and 299 in L2 (Fig. 11). However, when L2 sequences in the p#1-299 construct had been mutated so that only the 5'-most 300 nucleotides were wild type, the polyadenylation efficiency was 65%. In comparison, the full-length L2 gene in pCATL2 gave 100% efficiency of the pAE. These results showed that L2 sequences extending far into L2 were needed, suggesting that the secondary structure of the RNA may be important. Results for the human T-cell leukemia virus type 1 polyadenylation signal have demonstrated that a secondary structure is of major importance in human T-cell leukemia virus type 1 to bring the AAUAAA hexamer 276 nucleotides upstream into close proximity to the polyadenylation site (1, 6). In several studies, the secondary structure of the region surrounding the polyadenylation signal has been suggested to affect polyadenylation (17, 18, 31). Interestingly, when the triple-G motifs in HPV-16 L2 were consecutively mutated, we detected a gradual decrease in polyadenylation efficiency. It has previously been reported that binding of hnRNP H to G-rich motifs stimulates simian virus 40 polyadenylation (3). 
It has also been suggested that the G-rich hnRNP H binding site influences the RNA secondary structure of the simian virus 40 late polyadenylation signal (18). Arhin et al. showed that short G-rich tracts downstream of polyadenylation sites bind hnRNP H and that this binding stimulates polyadenylation both in vivo and in vitro (2). They suggested that G-rich elements folded into suboptimal conformations and that hnRNP H binding to the elements altered its structure. Alternatively, hnRNP H interacts directly with the polyadenylation machinery and stimulates CPSF and/or CStF binding and complex assembly (44). The requirement for a large portion of the HPV-16 L2 RNA sequence, together with the additive effect of each triple-G motif for polyadenylation at the HPV-16 pAE, argues for a model in which a certain secondary structure of the 5' end of the L2 RNA is required to present the hnRNP H, and perhaps CStF-64, binding sites in an optimal manner.

    There are a number of related hnRNP H proteins expressed in HeLa cells, including hnRNP H, hnRNP H', hnRNP F, and 2H9 (19, 27, 28). The molecular masses of hnRNP H, H', and F are between 50 and 60 kDa, whereas that of 2H9 is around 35 kDa. Analysis of the binding specificities of the various hnRNP H-related proteins revealed that all interacted with GGGA, whereas only hnRNP H and H' bound GGGC (7). HnRNP H has also been found to bind to GGGU (15). Here, five of six triple-G motifs in the 5' end of HPV-16 L2 are of the GGGU type and one is GGGA (Fig. 6). Taken together, these results suggest that hnRNP H and/or hnRNP H' is the major HPV-16 L2 RNA-interacting factor, whereas hnRNP F and 2H9 are less likely to bind HPV-16 L2 RNA. Staining of cervical epithelium with hnRNP H/H'-specific antiserum detected hnRNP H in the basal and suprabasal layers but not in the upper, more differentiated layers, demonstrating an inverse correlation between cell differentiation and levels of hnRNP H protein. These results support a model in which hnRNP H binds to the triple-G motifs and promotes polyadenylation at the pAE in the lower layers of the epithelium in the early stage of the viral life cycle. As the infected cell differentiates, production of hnRNP H is down regulated, resulting in less efficient polyadenylation at the pAE and read-through into the late region. This initiates the late, productive stage of the viral life cycle, which is defined by production of the late mRNAs and capsid proteins. Production of the major capsid protein L1 also requires that a fraction of the late mRNAs splice. This is regulated by a previously described hnRNP A1-dependent splicing silencer in the L1 coding region that is highly active in the early stage of the viral life cycle to prevent premature L1 production through direct splicing into the late region (49). 
However, as differentiation proceeds, the activity of this splicing silencer must decrease to allow splicing of the late mRNAs, thereby causing an optimal ratio of L2 to L1 mRNAs and efficient production of viral particles. We conclude that multiple factors must be required for polyadenylation at the HPV-16 early polyadenylation signal (48) and that hnRNP H plays a regulatory role in the differentiation-dependent induction of late gene expression in the viral life cycle.
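    The motif-type reasoning in the preceding paragraph (GGGA bound by all family members, GGGC by hnRNP H and H' only, GGGU by hnRNP H) can be made concrete with a small lookup. The Python sketch below is illustrative only: the table simply restates the specificities as summarized in the text (7, 15), the absence of a protein from an entry means that binding is not reported here rather than that it has been excluded, and the function names and toy sequence are hypothetical.

```python
import re
from collections import Counter

# Reported binders per motif type, as summarized in the text (refs. 7, 15).
# A protein missing from an entry is "not reported here", not "excluded".
REPORTED_BINDERS = {
    "GGGA": ["hnRNP H", "hnRNP H'", "hnRNP F", "2H9"],
    "GGGC": ["hnRNP H", "hnRNP H'"],
    "GGGU": ["hnRNP H"],
}


def tally_triple_g(seq):
    """Count GGGA, GGGC, and GGGU motifs in an RNA (or DNA) sequence."""
    rna = seq.upper().replace("T", "U")
    return Counter(m.group() for m in re.finditer(r"GGG[ACU]", rna))


def reported_binders(motif):
    """Return the proteins reported in the text to bind a motif type."""
    return REPORTED_BINDERS.get(motif, [])


# Hypothetical toy sequence (not HPV-16 L2 sequence).
counts = tally_triple_g("AGGGTCCGGGAAAGGGC")
for motif, n in sorted(counts.items()):
    print(motif, n, reported_binders(motif))
```

    For a region dominated by GGGU motifs, as described for the 5' end of HPV-16 L2, such a tally points to hnRNP H as the common reported binder, in line with the conclusion drawn above.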

    ACKNOWLEDGMENTS

    We thank J. Nikolic and D. Black for hnRNP H antiserum, C. Milcarek for the GST-CStF-64 plasmid, C. Öhrmalm for advice on immunoprecipitation, M. Rush and X. Zhao for materials and discussions, and members of the Akusjärvi laboratory for discussion.

    J.F. holds a postgraduate scholarship from the Dublin Institute of Technology. The research was sponsored by grants from the Swedish Research Council and the Swedish Cancer Society and by a basic research grant from the Dublin Institute of Technology.

    REFERENCES

    Ahmed, Y. F., G. M. Gilmartin, S. M. Hanly, J. R. Nevins, and W. C. Greene. 1991. The HTLV-I Rex response element mediates a novel form of mRNA polyadenylation. Cell 64:727-737.

    Arhin, G. K., M. Boots, P. S. Bagga, C. Milcarek, and J. Wilusz. 2002. Downstream sequence elements with different affinities for the hnRNP H/H' protein influence the processing efficiency of mammalian polyadenylation signals. Nucleic Acids Res. 30:1842-1850.

    Bagga, P. S., G. K. Arhin, and J. Wilusz. 1998. DSEF-1 is a member of the hnRNP H family of RNA-binding proteins and stimulates pre-mRNA cleavage and polyadenylation in vitro. Nucleic Acids Res. 26:5343-5350.

    Baker, C. C. 1997. Posttranscriptional regulation of papillomavirus gene expression. In S. R. Billakanti, C. E. Calef, A. D. Farmer, A. L. Halpern, and G. L. Myers (ed.), Human papillomaviruses: a compilation and analysis of nucleic acid and amino acid sequences. Theoretical biology and biophysics. Los Alamos National Laboratory, Los Alamos, N. Mex.

    Baker, C. C., and J. S. Noe. 1989. Transcriptional termination between bovine papillomavirus type 1 (BPV-1) early and late polyadenylation sites blocks late transcription in BPV-1-transformed cells. J. Virol. 63:3529-3534.

    Bar-Shira, A., A. Panet, and A. Honigman. 1991. An RNA secondary structure juxtaposes two remote genetic signals for human T-cell leukemia virus type I RNA 3'-end processing. J. Virol. 65:5165-5173.

    Caputi, M., and A. M. Zahler. 2001. Determination of the RNA binding specificity of the heterogeneous nuclear ribonucleoprotein (hnRNP) H/H'/F/2H9 family. J. Biol. Chem. 276:43850-43859.

    Cid-Arregui, A., V. Juarez, and H. zur Hausen. 2003. A synthetic E7 gene of human papillomavirus type 16 that yields enhanced expression of the protein in mammalian cells and is useful for DNA immunization studies. J. Virol. 77:4928-4937.

    Collier, B., L. Goobar-Larsson, M. Sokolowski, and S. Schwartz. 1998. Translational inhibition in vitro of human papillomavirus type 16 L2 mRNA mediated through interaction with heterogenous ribonucleoprotein K and poly(rC)-binding proteins 1 and 2. J. Biol. Chem. 273:22648-22656.

    Collier, B., D. Oberg, X. Zhao, and S. Schwartz. 2002. Specific inactivation of inhibitory sequences in the 5' end of the human papillomavirus type 16 L1 open reading frame results in production of high levels of L1 protein in human epithelial cells. J. Virol. 76:2739-2752.

    Cumming, S. A., C. E. Repellin, M. McPhillips, J. C. Radford, J. B. Clements, and S. V. Graham. 2002. The human papillomavirus type 31 late 3' untranslated region contains a complex bipartite negative regulatory element. J. Virol. 76:5993-6003.

    deVilliers, E. M., C. Fauquet, T. R. Broker, H. U. Bernard, and H. zur Hausen. 2004. Classification of papillomaviruses. Virology 324:17-27.

    Dignam, J. D., R. M. Lebovitz, and R. G. Roeder. 1983. Accurate transcription initiation by RNA polymerase II in a soluble extract from isolated mammalian nuclei. Nucleic Acids Res. 11:1475-1489.

    Disbrow, G. L., I. Sunitha, C. C. Baker, J. Hanover, and R. Schlegel. 2003. Codon optimisation of the HPV-16 E5 gene enhances protein expression. Virology 311:105-114.

    Fogel, B. L., L. M. McNally, and M. T. McNally. 2002. Efficient polyadenylation of Rous sarcoma virus RNA requires the negative regulator of splicing element. Nucleic Acids Res. 30:810-817.

    Furth, P. A., and C. C. Baker. 1991. An element in the bovine papillomavirus late 3' untranslated region reduces polyadenylated cytoplasmic RNA levels. J. Virol. 65:5806-5812.

    Gilmartin, G. M., E. S. Fleming, and J. Oetjen. 1992. Activation of HIV-1 pre-mRNA 3' processing in vitro requires both an upstream element and TAR. EMBO J. 11:4419-4428.

    Hans, H., and J. C. Alwine. 2000. Functionally significant secondary structure of the simian virus 40 late polyadenylation signal. Mol. Cell. Biol. 20:2926-2932.

    Honore, B., H. H. Rasmussen, H. Vorum, K. Dejgaard, X. Liu, P. Gromov, P. Madsen, B. Gesser, N. Tommerup, and J. E. Celis. 1995. Heterogeneous nuclear ribonucleoproteins H, H', and F are members of a ubiquitously expressed subfamily of related but distinct proteins encoded by genes mapping to different chromosomes. J. Biol. Chem. 270:28780-28789.

    Howley, P. M. 1996. Papillomavirinae: the viruses and their replication, p. 2045-2076. In B. N. Fields, D. M. Knipe, and P. M. Howley (ed.), Fields virology, 3rd ed., vol. 2. Lippincott-Raven Publishers, Philadelphia, Pa.

    Kaufmann, I., G. Martin, A. Friedlein, H. Langen, and W. Keller. 2004. Human Fip1 is a subunit of CPSF that binds to U-rich elements and stimulates poly(A) polymerase. EMBO J. 23:616-626.

    Kennedy, I. M., J. K. Haddow, and J. B. Clements. 1990. Analysis of human papillomavirus type 16 late mRNA 3' processing signals in vitro and in vivo. J. Virol. 64:1825-1829.

    Kennedy, I. M., J. K. Haddow, and J. B. Clements. 1991. A negative regulatory element in the human papillomavirus type 16 genome acts at the level of late mRNA stability. J. Virol. 65:2093-2097.

    Koffa, M. D., S. V. Graham, Y. Takagaki, J. L. Manley, and J. B. Clements. 2000. The human papillomavirus type 16 negative regulatory RNA element interacts with three proteins that act at different posttranscriptional levels. Proc. Natl. Acad. Sci. USA 97:4677-4682.

    Liu, W. J., F. Gao, K. N. Zhao, G. J. Fernando, R. Thomas, and I. H. Frazer. 2002. Codon modified human papillomavirus type 16 E7 DNA vaccine enhances cytotoxic T-lymphocyte induction and anti-tumour activity. Virology 301:43-52.

    Longworth, M. S., and L. A. Laimins. 2004. Pathogenesis of human papillomaviruses in differentiating epithelia. Microbiol. Mol. Biol. Rev. 68:362-372.

    Mahe, D., P. Mahl, R. Gattoni, N. Fisher, M. G. Mattei, J. Stevenin, and J. P. Fuchs. 1997. Cloning of human 2H9 heterogeneous nuclear ribonucleoproteins. Relation with splicing and early heat shock-induced splicing arrest. J. Biol. Chem. 272:1827-1836.

    Matunis, M. J., J. Xing, and G. Dreyfuss. 1994. The hnRNP F protein: unique primary structure, nucleic acid-binding properties, and subcellular localization. Nucleic Acids Res. 22:1059-1067.

    McPhillips, M. G., T. Veerapraditsin, S. A. Cumming, D. Karali, S. G. Milligan, W. Boner, I. M. Morgan, and S. V. Graham. 2004. SF2/ASF binds the human papillomavirus type 16 late RNA control element and is regulated during differentiation of virus-infected epithelial cells. J. Virol. 78:10598-10605.

    Öberg, D., B. Collier, X. Zhao, and S. Schwartz. 2003. Mutational inactivation of two distinct negative RNA elements in the human papillomavirus type 16 L2 coding region induces production of high levels of L2 in human cells. J. Virol. 77:11674-11684.

    Phillips, C., C. B. Kyriakopoulou, and A. Virtanen. 1999. Identification of a stem-loop structure important for polyadenylation at the murine IgM secretory poly(A) site. Nucleic Acids Res. 27:429-438.

    Schwartz, S. 1998. cis-acting negative RNA elements on papillomavirus late mRNAs. Semin. Virol. 8:291-300.

    Schwartz, S. 2000. Regulation of human papillomavirus late gene expression. Ups. J. Med. Sci. 105:171-192.

    Schwartz, S., X. Zhao, D. Öberg, and M. Rush. 2004. Regulation of papillomavirus late gene expression. Recent Res. Dev. Virol. 6:29-45.

    Shah, K. V., and P. M. Howley. 1996. Papillomaviruses, p. 2077-2109. In B. N. Fields, D. M. Knipe, and P. M. Howley (ed.), Fields virology, 3rd ed., vol. 2. Lippincott-Raven Publishers, Philadelphia, Pa.

    Sokolowski, M., H. Furneaux, and S. Schwartz. 1999. The inhibitory activity of the AU-rich RNA element in the human papillomavirus type 1 late 3' untranslated region correlates with its affinity for the elav-like HuR protein. J. Virol. 73:1080-1091.

    Sokolowski, M., W. Tan, M. Jellne, and S. Schwartz. 1998. mRNA instability elements in the human papillomavirus type 16 L2 coding region. J. Virol. 72:1504-1515.

    Spångberg, K., L. Wiklund, and S. Schwartz. 2000. HuR, a protein implicated in oncogene and growth factor mRNA decay, binds to the 3' ends of hepatitis C virus RNA of both polarities. Virology 274:378-390.

    Takagaki, Y., and J. L. Manley. 1997. RNA recognition by the human polyadenylation factor CstF. Mol. Cell. Biol. 17:3907-3914.

    Tan, W., B. K. Felber, A. S. Zolotukhin, G. N. Pavlakis, and S. Schwartz. 1995. Efficient expression of the human papillomavirus type 16 L1 protein in epithelial cells by using Rev and the Rev-responsive element of human immunodeficiency virus or the cis-acting transactivation element of simian retrovirus type 1. J. Virol. 69:5607-5620.

    Tan, W., and S. Schwartz. 1995. The Rev protein of human immunodeficiency virus type 1 counteracts the effect of an AU-rich negative element in the human papillomavirus type 1 late 3' untranslated region. J. Virol. 69:2932-2945.

    Terhune, S. S., W. G. Hubert, J. T. Thomas, and L. A. Laimins. 2001. Early polyadenylation signals of human papillomavirus type 31 negatively regulate capsid gene expression. J. Virol. 75:8147-8157.

    Terhune, S. S., C. Milcarek, and L. A. Laimins. 1999. Regulation of human papillomavirus type 31 polyadenylation during the differentiation-dependent life cycle. J. Virol. 73:7185-7192.

    Veraldi, K. L., G. K. Arhin, K. Martincic, L. H. Chung-Ganster, J. Wilusz, and C. Milcarek. 2001. hnRNP F influences binding of a 64-kilodalton subunit of cleavage stimulation factor to mRNA precursors in mouse B cells. Mol. Cell. Biol. 21:1228-1238.

    Wahle, E., and U. Ruegsegger. 1999. 3'-End processing of pre-mRNA in eukaryotes. FEMS Microbiol. Rev. 23:277-295.

    Wiklund, L., M. Sokolowski, A. Carlsson, M. Rush, and S. Schwartz. 2002. Inhibition of translation by UAUUUAU and UAUUUUUAU motifs of the AU-rich RNA instability element in the HPV-1 late 3' untranslated region. J. Biol. Chem. 277:40462-40471.

    Wu, C., and J. C. Alwine. 2004. Secondary structure as a functional feature in the downstream region of mammalian polyadenylation signals. Mol. Cell. Biol. 24:2789-2796.

    Zhao, X., D. Öberg, M. Rush, and S. Schwartz. 2005. A 57 nucleotide upstream early polyadenylation element in human papillomavirus type 16 interacts with hFip1, CstF-64, hnRNP C1/C2, and PTB. J. Virol. 79:4270-4288.

    Zhao, X., M. Rush, and S. Schwartz. 2004. Identification of an hnRNP A1-dependent splicing silencer in the human papillomavirus type 16 L1 coding region that prevents premature expression of the late L1 gene. J. Virol. 78:10888-10905.

    zur Hausen, H. 2002. Papillomaviruses and cancer: from basic studies to clinical application. Nat. Rev. Cancer 2:342-350.

    zur Hausen, H., and E. M. de Villiers. 1994. Human papillomaviruses. Annu. Rev. Microbiol. 48:427-447.

    (Daniel Öberg, Joanna Fay,)