Use of Sequence Data Generated in the Bayer TruGene Genotyping Assay To Recognize and Characterize Non-Subtype-B Human Immunodeficiency Viru
     Department of Laboratory Medicine and Pathology, Hennepin County Medical Center, Minneapolis, Minnesota 55415

    Department of Laboratory Medicine and Pathology, University of Minnesota Medical School, Minneapolis, Minnesota 55455

    ABSTRACT

    Human immunodeficiency virus type 1 (HIV-1) protease (PR) and reverse transcriptase (RT) gene sequences obtained during antiretroviral resistance testing with a commercial genotyping assay (TruGene; Bayer Corp.) were analyzed to assess the utility of these data for detecting and characterizing non-subtype-B HIV-1 strains. A total of 125 viral sequences obtained from patients believed to have acquired their HIV-1 infection in Africa were analyzed, of which 121 were determined to belong to non-B subtypes. Utilizing TruGene sequence data alone, 92 (76%) of these viruses could be subtyped by conventional phylogenetic analysis. The addition of supplemental RT sequence data enabled a further 28 (23.1%) viruses to be classified, while one (0.9%) sample could not be classified conclusively. Two internet-accessible databases that generate HIV-1 subtypes from PR and RT sequences (HIV-SEQ and Geno2Pheno) were also evaluated, and both achieved 88% concordance (106/120) with phylogenetic analysis. Non-subtype-B and B-subtype HIV-1 sequences could be readily discriminated by tallying silent polymorphisms listed on the TruGene research report. The mean number of silent polymorphisms in the non-B HIV-1 sequences identified in this study was 58.3 (95% confidence interval [CI], 41.1 to 75.5), compared with 20.7 (95% CI, 9.9 to 31.5) for the four subtype B viruses in the study cohort and 118 case-matched B-subtype controls. Sequence data generated in the TruGene HIV-1 genotyping assay could, therefore, provide a ready means of tracking the prevalence and identity of non-B subtypes in HIV-1-infected populations undergoing routine antiretroviral resistance testing.

    INTRODUCTION

    One of the hallmarks of the human immunodeficiency virus type 1 (HIV-1) is its enormous genetic diversity, a consequence of high rates of mutation and recombination during viral replication, necessitating the development of an increasingly complex scheme for viral classification (30, 31, 35). The major (M) group of HIV-1 strains, responsible for the overwhelming majority of infections, currently contains nine recognized subtypes (A to D, F to H, J, and K), two of which, A and F, are further divided into sub-subtypes (A1 and A2 and F1 and F2) (31). The circulation of multiple subtypes in certain geographic regions has led to further diversity since it has resulted in a subset of patients in these locales becoming simultaneously infected with viruses belonging to different subtypes, with subsequent intersubtype recombination events generating unique chimeric viruses (36). The epidemiologic importance of these viruses has been demonstrated in several studies conducted in sub-Saharan Africa, where unique chimeras have been shown to be present in 30% or more of HIV-1-infected individuals (10). Finally, certain specific intersubtype recombinant viruses have been found in multiple epidemiologically unrelated individuals, indicating the establishment of entirely novel recombinant subtypes termed circulating recombinant forms (CRFs), of which 16 have been formally recognized to date (31, 36).

    The impact of the bewildering genetic variability of HIV-1 on therapeutic response and antiretroviral resistance development remains uncertain (11, 33, 34). This lack of knowledge is due both to the comparatively limited use of antiretroviral drugs in those areas of the world where HIV-1 diversity is greatest and to the overwhelming predominance of a single subtype, namely B, in those areas of the world where antiretroviral use is heaviest (33). Conflicting reports have been published on the relative rates of de novo primary resistance mutations to antiretroviral drugs in B versus non-B subtypes (2, 3, 8, 9, 22, 29), and insufficient data are currently available to definitively determine whether the genetic background of a viral isolate (as indicated by subtype) influences the phenotypic significance of resistance-associated mutations (11, 22, 33). Given both burgeoning efforts to implement antiretroviral treatment programs in the developing world, such as the WHO "3 million by 2005" initiative (14), and the emerging significance of non-B HIV-1 subtypes in the developed world (7, 21, 27, 32), it would seem prudent to perform viral subtype surveillance in patients prior to commencing antiretroviral therapy in order to better elucidate relationships between viral subtype and therapeutic outcome.

    Phylogenetic analysis of nucleotide sequences of one or more regions of the HIV-1 genome remains the primary approach for determining viral subtype (31). Definitive assignation of a viral isolate to a particular subtype or CRF and optimal identification of unique chimeras require sequencing of multiple regions of the genome, a costly and laborious exercise. Although such extensive evaluation is clearly necessary in investigations seeking either to thoroughly assess viral diversity in a given population or to determine local and global patterns of viral transmission, it may not be required if the scope of the investigation is more limited, such as investigating the influence of subtype on therapeutic outcome. With the exception of the recently licensed fusion inhibitor enfuvirtide (16), the target site for all antiretroviral agents is either the protease (PR) or reverse transcriptase (RT) of HIV-1, proteins encoded by contiguous genes within the pol region of the viral genome. Determination of resistance to these agents is primarily accomplished by RT-PCR amplification of circulating viral RNA followed by nucleotide sequencing to detect the presence of resistance-conferring mutations, a technique commonly referred to as HIV genotyping. To assess the influence of genetic variation on parameters of therapeutic effectiveness, it would seem reasonable to determine subtype using the genomic region targeted by antiretroviral agents, and thus the sequence information generated by commercial HIV genotyping systems, such as the ViroSeq HIV-1 genotyping system (Celera Diagnostics) and the TruGene HIV-1 genotyping test (Bayer Corp.), seems ideally suited for this purpose. Preliminary evaluations of this approach with the ViroSeq system have been encouraging, although determination of subtype in all studies to date has been accomplished using complicated, time-consuming, phylogenetic analysis of data (4, 13). 
Interestingly, however, a number of web-based algorithms are now available that offer a real-time determination of viral subtype from PR and RT sequences. HIV-SEQ is one of several applications available at the Stanford University HIV Drug Resistance Database (http://hivdb.stanford.edu) and includes an algorithm for assigning subtypes that involves determining comparative nucleotide similarity scores between submitted sequences and a panel of reference sequences representative of the main subtypes and CRFs of group M HIV-1 strains (18). Geno2Pheno (http://195.37.60.133/cgi-bin/geno2pheno.pl) is an application developed by Genafor (Bonn, Germany) ostensibly for predicting phenotypic resistance to antiretroviral agents from genotypic data using decision tree analysis (6); however, a supplementary algorithm simultaneously performs a BLAST-type comparison of submitted sequences with a database containing several representative sequences of each subtype and CRF. Utilization of these services requires comparatively little effort by the end user; HIV-SEQ will accept sequence files directly from the TruGene system, obviating even the need to convert them into file formats compatible with FASTA/BLAST analyses (28), and no knowledge of phylogenetic analysis is needed. If such simplified analyses can provide a reasonably accurate determination of subtype, then most, if not all, laboratories performing HIV genotyping could determine viral subtype relatively routinely and, ultimately, algorithmic approaches similar to those currently offered by third parties could be incorporated into future iterations of the sequence analysis software provided by the genotyping assay manufacturers.

    The present study was undertaken both to establish the utility of sequence data generated by the TruGene HIV-1 genotyping system for determining viral subtype using conventional phylogenetic analysis and to examine the accuracy of web-based algorithms for elucidating subtypes from the same data. In addition, information provided in the TruGene genotyping report was scrutinized in order to determine whether some simple screening metric could be used to demarcate B from non-B viruses prior to subjecting sequences to further interrogation.

    MATERIALS AND METHODS

    Study institution and patients. Hennepin County Medical Center is a 450-bed public teaching facility located in Minneapolis, Minnesota. The institution averages 20,000 patient admissions and 400,000 outpatient clinic and emergency room visits annually and provides continuing care for approximately 1,500 persons with HIV infection. Individuals who acquired their infection in Africa and are thus highly likely to be infected with a non-B HIV-1 subtype have become an increasingly significant component of those receiving care in the HIV clinic at Hennepin County Medical Center, comprising approximately 15% of the current patient population (1). From a database maintained in the HIV clinic, a total of 125 patients were identified who were believed to have acquired their HIV infection in Africa and had either had a TruGene HIV genotype test performed (n = 95) or had an archived plasma sample available (n = 30) that contained at least 50 copies of HIV RNA/ml. Additional demographic, clinical, and virologic characteristics of this cohort of patients have been described elsewhere (1).

    HIV-1 genotyping. HIV-1 PR and RT sequences were generated from archived plasma samples using the TruGene HIV-1 genotyping assay (v1.5) (Bayer Diagnostics) essentially as recommended by the manufacturer. The only significant modification was the addition of an initial viral concentration step on samples with HIV RNA concentrations of 50 to 1,000 copies/ml in which 1 ml of plasma was centrifuged at 28,000 x g for 60 min at 4°C and 860 μl of supernatant was removed prior to resuspension of the resultant pellet. The TruGene HIV-1 genotyping assay consists of an RT reaction performed on HIV RNA extracted from plasma followed by PCR amplification of a 1,318-bp fragment of the pol gene. Cycle sequencing of two noncontiguous regions of this fragment is then performed, generating a 288-bp sequence of the PR gene (codons 4 to 99) and a 630-bp sequence of the RT gene (codons 36 to 247). The OpenGene (Bayer Diagnostics) software system edits and aligns the resultant data, generates a protein translation of the sequence, and determines the presence of antiretroviral resistance-conferring polymorphisms (24).

    Phylogenetic analysis of sequence data. PR and RT sequences generated using the TruGene HIV genotyping assay were converted from the native format of the Unix-based OpenGene program (.glc files) to FASTA-formatted text files using the Exporter utility and then exported to a desktop personal computer running the Windows operating system. Since the TruGene system generates noncontiguous sequences for the protease and RT genes, sequences for each gene were independently aligned with 59 representative sequences of HIV-1 group M subtypes and CRFs available from the Los Alamos database (http://www.hiv.lanl.gov/content/hiv-db) using BioEdit v. 5.0.9. Sequences used for alignment were as follows: A1-Q23-17 (AF004885), A1-SE7253 (AF069670), A1-U455 (M62320), A1-UG037 (U51190), A2-97CDKFE4 (AF286240), A2-97CDKS10 (AF286241), A2-94CY017-41 (AF286237), B-HXB2 (K03455), B-RF (M17451), B-JRFL (U63632), B-WEAU160 (U21135), C-92BR025 (U52953), C-96BW0502 (AF110967), C-ETH2220 (U46016), C-99ET3 (AY255827), C-SM145 (AF447850), C-95IN21068 (AF067155), D-ELI (K03454), D-NDK (M27323), D-84ZR085 (U88822), D-94UG114 (U88824), F1-VI850 (AF077336), F1-93BR020.1 (AF005494), F1-FIN9363 (AF075703), F1-MP411 (AJ249238), F2-MP255 (AJ249236), F2-MP257 (AJ249237), G-DRCBL (AF084936), G-92NG083 (U88826), G-SE6165 (AF061642), H-VI991 (AF190127), H-VI997 (AF190128), H-90CF056 (AF005496), J-SE7887 (AF082394), J-SE7022 (AF082395), K-EQTB11C (AJ249235), K-MP535 (AJ249239), CRF01_AE-90CF11697 (AF197340), CRF01_AE-90CF402 (U51188), CRF01_AE-90CF4071 (AF197341), CRF01_AE-CM240 (U54771), CRF02_AG-LB12 (AF212281), CRF02_AG-LBPOC44951 (AF447831), CRF02_AG-97CM-MP807 (AJ286133), CRF02_AG-DJ264 (AF063224), CRF02_AG-IBNG (L39106), CRF02_AG-SE7812 (AF107770), CRF03_AB-KAL153-2 (AF193276), CRF04_cpx-CY032 (AF049337), CRF05_DF-VI1310 (AF193253), CRF06_cpx-BFP90 (AF064669), CRF07_BC-97CN001 (AF286226), CRF08_BC-97CNGX-7F (AY008716), CRF10_CD-96TZ-BF061 (AF289548), CRF11_cpx-MP818 (AJ291718), CRF12_BF-ARMA159 
(AF385936), CRF13_cpx-1849 (AF460972), CRF14_BG-X475 (AF423759), SIV-CPZGAB (X52154). Phylogenetic distances were then elucidated by constructing neighbor-joining trees based on Kimura's two-parameter matrix, and the robustness of these relationships was tested by the bootstrap method using 100 replications. These analyses were conducted using applications provided in the MEGA 2.1 software suite (www.megasoftware.net) (23). The simian immunodeficiency virus sequence CPZGAB was used as the outgroup in all phylogenetic trees. Determination of subtype or CRF was considered definitive only if the bootstrap value linking a given sample exceeded 70% for both gene regions analyzed. Samples that could not be unambiguously assigned to a subtype using this approach were subjected to further analysis, consisting of the performance of an additional amplification and sequencing reaction to fill the 111-bp gap between codon 99 of the PR gene and codon 36 of the RT gene. Using TruGene RT-PCR products as templates, PCR was performed using primers polfp2 (5'-GGTACAGTATTAGTAGGACCTACA-3') and polrp2 (5'-CCCACATCTAGTACTGTCACT-3'), generating a 415-bp amplicon spanning the 3' end of the PR gene and the 5' end of the RT gene. PCR products were purified, and sequencing was performed on an automated DNA sequencer (ABI model 3730; Applied Biosystems) using the ABI Prism Big Dye Terminator Cycle Sequencing kit v3.1 (Applied Biosystems). Contiguous 1,029-bp pol sequences were then assembled, and phylogenetic analysis was repeated as described above. For samples still yielding ambiguous results, the possibility that these sequences consisted of intersubtype mosaics was initially investigated using the recombination identification program (RIP v2.0; http://hivweb.lanl.gov/RIP/RIPsubmit.html) and then further analyzed by bootscanning using the SimPlot v2.5 software package (http://sray.med.som.jhmi.edu/SCRoftware).
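
    The distance measure underlying the neighbor-joining trees can be illustrated with a minimal Python sketch. It is not the MEGA implementation: the function name k2p_distance and the choice to skip any site containing a gap or ambiguity code are assumptions made for illustration. With P and Q denoting the proportions of transitions and transversions among the compared sites, Kimura's two-parameter distance is d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)].

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = {"A", "C", "G", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned nucleotide sequences.

    Illustrative sketch only; sites with gaps or ambiguity codes are skipped
    (an assumption, not necessarily what MEGA does by default).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in BASES or b not in BASES:
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

    For two aligned 10-base sequences that differ by a single transition, P = 0.1 and Q = 0, giving d = -(1/2) ln(0.8), or approximately 0.112.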

    Subtype determination using web-based applications. The subtype of the 125 samples from African-born individuals was also determined by uploading FASTA-formatted files of TruGene sequence data to two internet-accessible, public domain computer programs, namely, HIV-SEQ (Stanford University, Palo Alto, Calif.; www.hivdb.stanford.edu), and Geno2Pheno (Genafor, Bonn, Germany; http://195.37.60.133/cgi-bin/geno2pheno.pl). Since HIV-SEQ independently analyzes RT and protease sequence data, use of this program resulted in the generation of a subtype for each of the genomic regions analyzed, whereas Geno2Pheno provided only a single subtype based on the best fit of the entire sequence to the reference sequences in its database.
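
    The difference between a per-region call and a single best-fit call can be sketched in Python. This is a toy nearest-reference classifier based on percent identity; it is not the actual HIV-SEQ or Geno2Pheno algorithm, and the function names, the reference dictionary layout, and the assumption that all sequences are already aligned are hypothetical.

```python
BASES = {"A", "C", "G", "T"}


def percent_identity(query, reference):
    """Percent of unambiguous aligned positions at which two sequences match."""
    pairs = [(q, r) for q, r in zip(query.upper(), reference.upper())
             if q in BASES and r in BASES]
    if not pairs:
        return 0.0
    matches = sum(1 for q, r in pairs if q == r)
    return 100.0 * matches / len(pairs)


def best_subtype(query, references):
    """Return (subtype, percent identity) for the most similar reference.

    `references` maps a reference name to a (subtype, aligned sequence) pair.
    """
    best = max(references.values(),
               key=lambda ref: percent_identity(query, ref[1]))
    return best[0], percent_identity(query, best[1])


def assign_by_region(pr_query, rt_query, pr_refs, rt_refs):
    """Call PR and RT independently, so a discordant pair can be flagged."""
    pr_subtype, _ = best_subtype(pr_query, pr_refs)
    rt_subtype, _ = best_subtype(rt_query, rt_refs)
    return {"PR": pr_subtype, "RT": rt_subtype,
            "discordant": pr_subtype != rt_subtype}
```

    A classifier that returns only one best-fit subtype for the whole sequence has no way to report the discordant case, which is why per-region output is needed if chimeric sequences are to be recognized.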

    Nucleotide sequence accession numbers. GenBank accession numbers for the contiguous pol sequences determined in the study are DQ009495 to DQ009523.

    RESULTS

    Viral quantitation of non-B-subtype samples used for HIV-1 genotyping. The viral loads of all plasma samples used for genotyping that contained non-B-subtype HIV-1 strains were ascertained using either the Quantiplex bDNA v3.0 assay (Bayer Diagnostics, Tarrytown, NY) or the Roche COBAS Monitor v1.5 UltraSensitive RT-PCR assay (Roche Molecular Diagnostics, Indianapolis, Ind.), both of which have been demonstrated to provide accurate determinations of non-subtype-B HIV-1 RNA levels (12). Viral loads of plasma samples (n = 28) collected prior to August 2002 were determined using the bDNA assay and ranged from 2.70 to 5.74 log10 copies/ml with a mean viral load of 4.13 log10 copies/ml. The viral loads of the remaining samples (n = 93) were determined using the UltraSensitive RT-PCR assay and ranged from 2.20 to >5.0 log10 copies/ml. Thirty-two of these samples (34.4%) had viral loads in excess of the upper limit of quantitation of this assay (5.0 log10 copies/ml); the mean viral load of the remaining 61 samples was 3.96 log10 copies/ml.

    Performance of the TruGene HIV-1 genotype assay. Version 1.5 of the TruGene HIV genotyping assay successfully generated double-stranded sequence in both the protease and RT regions for all non-B viruses tested. This included nine samples with viral loads of <3.0 log10 copies/ml, the stated lower limit of sensitivity of the TruGene assay, that were subjected to an ultracentrifugation procedure prior to isolation of HIV RNA. The TruGene assay contains four sets of sequencing primers, two of which (PR and P2) cover a virtually identical region of the protease gene and two (RT beginning and RT middle) that between them cover codons 36 to 247 of the RT gene. No difference was observed in the performance of the RT sequencing primers when non-B and B-subtype viruses were compared (data not shown), with excellent double-stranded sequence being obtained in this region irrespective of the viral subtype. Similarly, excellent sequence data were obtained using the P2 protease primer pair; however, the PR protease primer pair performed relatively poorly with non-B-subtype sequences. For only 41/121 (33.9%) of the non-B-subtype sequences analyzed was the PR sequence of sufficient quality in both directions to be usable. The success rate of the PR primer pair was somewhat subtype specific, with the highest success rate being observed with viruses belonging to CRF02_AG (20/26; 76.9%) and subtype D (9/12; 75.0%). In contrast, only 12.5% (6/48) of subtype C viruses and 13.0% (3/23) of subtype A viruses yielded usable sequence with the PR primer set.

    Phylogenetic determination of HIV-1 subtype. Since the TruGene assay yields two noncontiguous sequences, one for the PR gene and one for the RT gene, phylogenetic analysis initially was performed independently for each of the genes. Using the criterion of a bootstrap value in excess of 70% (based on 100 replicates) to assign each unknown sequence to a viral subtype, 96/125 (76.8%) samples gave unambiguous subtyping results for both the PR and RT regions using the TruGene sequencing data (Table 1). Four of the patients in the initial cohort were determined by this analysis to be infected with subtype B viruses; thus, the success rate for non-B viruses using this approach was 76.0% (92/121). Ninety-one of the 92 non-B viruses whose subtype could be resolved using phylogenetic analysis of only the TruGene sequence data belonged to pure subtypes or CRFs, while a single sample contained an unequivocally chimeric virus that clustered with subtype C viruses in the protease region and with subtype D viruses in the RT region. Of the 29 viruses whose subtype could not be conclusively ascertained using purely TruGene data, 12 (41.4%) yielded ambiguous results for both gene regions, 10 (34.5%) could be assigned a subtype based on analysis of the RT but not the PR gene, and 7 (24.1%) could be assigned based on analysis of the PR but not the RT gene. The generation of additional RT sequence enabled phylogenetic analysis of a contiguous 1,029-bp pol fragment to be performed on the 29 problematic viruses (Table 1). Twenty-eight of these 29 samples could then be subtyped with an acceptable level of confidence, with 22 belonging to single subtypes or CRFs and six being chimeric in this region (three G/A, one A/D, one A/C, and one F/K). Representative examples of the bootscanning plots obtained with these intersubtype recombinants are shown in Fig. 1. 
The subtype composition of a single sample could not be conclusively determined, although it appeared to be most likely an A/C recombinant. The subtype designation of the remaining 120 non-B viruses in the sample cohort is listed in Table 1.
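
    The classification rule applied to the two TruGene regions can be summarized in a short Python sketch. The function name classify and the returned strings are hypothetical; the only element taken from the study is the requirement that bootstrap support exceed 70% in both gene regions before a subtype is assigned.

```python
BOOTSTRAP_THRESHOLD = 70  # percent; support must exceed this value


def classify(pr_call, rt_call, threshold=BOOTSTRAP_THRESHOLD):
    """Combine per-region phylogenetic calls into one result.

    pr_call and rt_call are (subtype, bootstrap support in percent) pairs.
    """
    pr_subtype, pr_support = pr_call
    rt_subtype, rt_support = rt_call

    # Either region below threshold: the sample needs the extra sequencing step
    if pr_support <= threshold or rt_support <= threshold:
        return "ambiguous"
    # Both regions well supported and in agreement: pure subtype or CRF
    if pr_subtype == rt_subtype:
        return pr_subtype
    # Both regions well supported but discordant: intersubtype chimera
    return f"{pr_subtype}/{rt_subtype} chimera"
```

    Under this rule, a sample clustering with subtype C in PR and subtype D in RT, each with strong support, is reported as a C/D chimera, whereas a sample with weak support in either region is referred for sequencing of the gap between the two regions.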

    Subtype designation using HIV-SEQ (Stanford). HIV-SEQ interrogates PR and RT sequence data independently; thus, two viral subtypes were generated for each TruGene sequence uploaded to the database. For 106 of the 120 samples (88.3%) in our cohort that were determined to be non-B and could be assigned to a subtype by phylogenetic analysis, the results of the HIV-SEQ analysis were entirely in concordance with the ultimate result of the phylogenetic analysis (Table 2). Of the 14 samples that were assigned incorrectly using HIV-SEQ, four (28.6%) were assigned incorrectly for the PR gene but correctly for the RT gene, three (21.4%) were assigned incorrectly for the RT gene but correctly for the PR gene, and seven (50.0%) were assigned incorrectly for both regions. The most common error was the designation of a viral sequence as belonging to CRF01_AE when the virus was subtype A. This type of error accounted for 56.5% (13/23) of miscalls in either gene (Table 2). The remaining misidentifications consisted of three failures to detect intersubtype chimeras (all of which were G/A), one misidentification of a subtype C virus as subtype K in the PR region (and therefore as a K/C chimera), one misidentification of a subtype G virus as CRF02_AG in the PR region, and one failure to identify a CRF (CRF06_cpx) in either region (Table 2). Although HIV-SEQ reports a percent similarity between the submitted sequence and the closest match in its database, we found no correlation between this value and the likelihood of the HIV-SEQ subtype determination being in agreement with the phylogenetic analysis (data not shown).

    Subtype designation using Geno2Pheno (Genafor). Subtype determination by the Geno2Pheno program was concordant with phylogenetic analysis for 106/120 (88.3%; Table 2) sequences in our non-B HIV-1 cohort. Not surprisingly, given that Geno2Pheno generated only a single "best-fit" subtype for each PR-RT sequence submitted, it misclassified all seven of the chimeric sequences in the cohort. Of the remaining seven errors made by Geno2Pheno, five were incorrect designations of pure subtype viruses as CRFs, one was an incorrect designation of a CRF as a pure subtype, and one was a misidentification of a subtype C virus as subtype K (Table 2). Interestingly, this was not the same subtype C virus that was misclassified as a K/C chimera by HIV-SEQ.

    Validation of a screening algorithm for identifying non-B-subtype sequences. Once the determination of forward and reverse PR and RT sequences has been completed in the TruGene assay, the OpenGene software program combines these sequences and compares them with an archetypal subtype B viral sequence, HIV-LAI (A04321). A final report is then generated which, in addition to listing significant mutations at known resistance codons, also contains a line listing of silent polymorphisms. Given that the comparator sequence is a subtype B virus, we reasoned that the number of silent polymorphisms listed should be considerably higher if the sample sequence belongs to a non-subtype-B virus. Table 3 shows the mean number and 95% confidence interval (CI) of silent polymorphisms for the 121 non-B viruses analyzed in this study and for 121 subtype B viruses. The subtype B cohort includes the viruses from the four patients originally included in our study cohort and 118 additional sequences retrieved from our archives. These patients were selected to match the non-B cohort with respect to viral load, CD4+ cell count, and exposure to antiretroviral drugs, and their subtype was confirmed as B by phylogenetic analysis of TruGene sequences. As expected, significantly more silent polymorphisms were detected in the non-subtype-B viral sequences than in the B sequences (Table 3). The mean number of silent polymorphisms in the PR and RT regions combined was 20.7 for subtype B viruses (95% CI, 9.9 to 31.5) versus 58.3 (95% CI, 41.1 to 75.5) for non-subtype-B viruses (Table 3). When the individual gene regions were analyzed independently, the mean number of silent polymorphisms was significantly higher in non-B than in B viruses for the RT region but not for the PR region (Table 3). 
We also examined the number of silent polymorphisms reported by subtype to establish that all the non-B subtypes detected in our cohort could be effectively differentiated from subtype B viruses using an algorithm based on this metric (Table 3). All of the subtypes identified in our non-B cohort could be differentiated from subtype B viruses, at the 95% confidence level, based on either the number of mutations in the RT sequence alone or that in the PR and RT regions combined (Table 3); however, neither subtype A nor subtype D viruses had significantly higher numbers of mutations in the PR region than subtype B viruses did (Table 3). Subtype D viruses also had significantly lower numbers of mutations in both the PR and RT regions than did any of the other non-B-subtype viruses identified in our cohort (Table 3).

    To determine an appropriate cutoff for screening TruGene assay results for non-B viral sequences, we determined the sensitivity and specificity of using various total numbers of silent mutations for this purpose (Table 4). A cutoff value of 30 or fewer total silent mutations appeared to result in an optimal combination of sensitivity (100%, with no non-B viruses having a total mutation count of <35) and specificity (96.8%), enabling detection of all non-B clade viruses with exclusion of all but four subtype B viruses in our cohort (Table 4). Although using a cutoff value of 35 or fewer mutations would have eliminated all subtype B viral sequences from further analysis, one subtype D virus in our cohort had 35 silent mutations, and a further six of the 12 subtype D viruses had silent mutation counts between 35 and 38, suggesting that the negative predictive value of such an aggressive cutoff value might be less than ideal in populations with higher frequencies of subtype D infection.
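
    The sensitivity and specificity calculation for a given cutoff can be sketched as follows. The function name screen_performance is hypothetical, and any counts passed to it in examples are invented for illustration rather than taken from Table 4; a sequence is flagged as a possible non-B virus when its total silent polymorphism count exceeds the cutoff.

```python
def screen_performance(non_b_counts, b_counts, cutoff):
    """Sensitivity and specificity of flagging sequences with count > cutoff.

    non_b_counts: silent polymorphism totals for known non-B sequences
    b_counts:     silent polymorphism totals for known subtype B sequences
    """
    true_positives = sum(1 for n in non_b_counts if n > cutoff)
    true_negatives = sum(1 for n in b_counts if n <= cutoff)
    sensitivity = true_positives / len(non_b_counts)
    specificity = true_negatives / len(b_counts)
    return sensitivity, specificity
```

    With invented counts of 35, 40, 58, and 72 for non-B sequences and 12, 20, 25, and 33 for subtype B sequences, a cutoff of 30 flags every non-B sequence but also one subtype B sequence, while a cutoff of 35 excludes every subtype B sequence but misses the non-B sequence with exactly 35 polymorphisms. This is the same trade-off described above for subtype D viruses.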

    DISCUSSION

    The results of this study demonstrate that the TruGene HIV-1 genotyping assay can be used to generate sequence data for determining antiretroviral resistance in phylogenetically diverse non-B-subtype viruses, even those with viral loads substantially below the claimed lower limit of the assay, and that the sequence information thus generated can be used to determine with a reasonable degree of accuracy the subtype of such viruses.

    Our finding that version 1.5 of the TruGene assay enables non-B HIV-1 subtype viruses to be sequenced with a high success rate is consistent with previous studies examining the performance characteristics of this version of the assay (5, 15, 20). The primary difference between versions 1.0 and 1.5 of the TruGene HIV-1 genotyping assay is a modification to the RT-PCR primers (20). This change resulted in a more efficient and robust RT-PCR amplification, which in turn improved the performance of the assay on both low-viral-load samples and samples containing non-B-subtype viruses (20). Our finding of a significant failure rate of the PR sequencing primer pair of the TruGene assay for most non-B subtypes is also in agreement with previous studies. In our sample cohort, only 34% of protease sequences could be elucidated using the PR primer pair, a result comparable to the 33% reported by Fontaine et al. (15) and 22% determined by Jagodzinski and coworkers (20). Interestingly, we obtained a significantly higher rate of success with the PR primers for determining the protease sequence of both subtype D (75%) and CRF02_AG (77%) viruses compared with other non-B viruses. The relatively small phylogenetic distance between subtypes B and D, and the resultant lower overall frequency of mismatches in both the RT-PCR and sequencing primer regions, is a logical explanation for the relative success of this primer pair with subtype D. Presumably, the absence of certain mismatches in the primer regions also accounts for the relative success of the PR primer pair with CRF02_AG viruses compared with viruses belonging to other subtypes whose overall sequence composition is equally divergent from subtype B. The ultimate importance of the PR primer sequencing failures was negligible, since excellent double-stranded sequence was typically obtained with the alternate protease sequencing primers (P2). 
Nevertheless, the observed subtype-specific variation in the performance of the PR primer pair highlights the need for continued surveillance of the performance of systems such as the TruGene assay on novel CRFs and unique viral recombinants as they are discovered to ensure that these assays continue to perform effectively on all group M HIV-1 strains.

    The need to perform epidemiologic surveillance of HIV-1 diversity in the United States has been recognized by the Centers for Disease Control and Prevention (37), although the precise mechanism by which such surveillance should or could take place remains to be determined. Given the relative rarity of non-B viruses in most populations of HIV-infected persons in the United States, conducting active surveillance for viral variants by collecting samples for sequence analysis on a representative sampling of newly diagnosed individuals would be a highly onerous, expensive, and largely unrevealing exercise. However, emerging evidence of increasing rates of transmission of viruses harboring antiretroviral resistance mutations has resulted in the publication of more expansive guidelines for using genotypic resistance assays (19), making passive surveillance of viral diversity by screening genotyping data an increasingly viable option. The possibility of using the list of silent (synonymous) mutations documented on the TruGene report for differentiating B from non-B viral sequences was first suggested in the study by Jagodzinski and colleagues (20), although the relatively small sample size of their study precluded them from definitively establishing the validity of this technique. We analyzed the number of silent mutations reported by TruGene for each non-B virus in our cohort and compared this group with a matched group of subtype B viruses that had been sequenced in our laboratory (Table 4). The use of a simple cutoff value of 30 silent polymorphisms enabled detection of all non-B viral sequences while eliminating 117 of the 121 subtype B viral sequences (Table 4). These data convincingly establish that non-B viruses can be detected in a population that contains predominantly subtype B viruses without any expenditure of effort beyond that necessary to perform the TruGene assay for its intended clinical purpose of detecting antiretroviral resistance mutations. Screening TruGene assay reports, with samples whose analysis resulted in a listing of >30 silent polymorphisms being retained for further investigation, could provide a highly cost-effective means of monitoring the incidence of non-B viral subtypes in settings where genotyping assays are commonly performed at the time of HIV diagnosis.

    The primary purpose of this investigation was to assess the utility of the sequence data generated in the TruGene assay for determining viral subtype, either via conventional phylogenetic analysis or by using public-domain, internet-based software programs. Since the TruGene assay does not currently generate sequence for the entire RT-PCR amplicon, independent phylogenetic analyses were initially performed for the PR and RT sequences. Using comparatively stringent criteria for subtype classification (see Materials and Methods), only 92 of the 121 (76%) non-B-subtype viruses in our cohort could be definitively assigned to a given subtype or CRF. This is not surprising given the relatively short sequence lengths, 288 bp and 630 bp for the PR and RT genes, respectively, generated by the TruGene assay, and the high degree of sequence conservation in the pol region of HIV-1 (4, 31). The encouraging findings of a number of other investigators regarding the use of the pol region, rather than the more diverse and widely utilized env and gag regions, for subtype designation of HIV-1 viruses (4, 13, 18, 32) led us to believe that it was likely the limitation of having only two short, noncontiguous sequences for analysis that was responsible for our overall lack of initial success. To resolve this problem, we performed additional PCR amplifications on TruGene RT-PCR amplicons generated from those samples that could not be subtyped definitively using the TruGene sequence data alone, sequenced the products, and then generated contiguous PR-RT sequences encompassing 1,029 bp of the pol gene. Phylogenetic analysis of these sequences enabled us to conclusively identify the subtype of 28/29 (96.6%) viruses that had previously yielded ambiguous results, resulting in an extremely high overall success rate of 99.2% (120/121).

    Not surprisingly, viral sequences that proved problematic for subtype determination were not evenly distributed among the different subtypes identified in our study. Subtypes G and A were significantly overrepresented in the problematic viral group, with 2 of 3 (67%) subtype G viruses and 11 of 23 (48%) subtype A viruses requiring additional sequencing for classification. This is not unexpected given that differentiation of subtype A and CRF01_AE, which has a subtype A-derived pol sequence, and to a lesser extent subtype G and CRF02_AG, which has a subtype G-derived pol sequence, can be problematic if only a limited amount of sequence information is available for analysis (31). Subtype C and D viruses, in contrast, were readily identified without the need for additional sequencing, with only 3 of 48 (6%) subtype C and 1 of 12 (8%) subtype D viruses yielding ambiguous results upon analysis of TruGene-generated sequence data.

    Among the viruses in our non-B cohort were seven that could be identified as intersubtype chimeras with recombination breakpoints within the PR-RT genes and one virus that was almost certainly chimeric, although the precise subtype composition of the PR-RT region of that virus could not be elucidated. Numerous previous studies have demonstrated that, in those parts of the world where multiple subtypes occur with some frequency, unique recombinant viruses are not uncommon and may even constitute a substantial minority of the total viral population (4, 10, 25, 38). The fact that a number of intersubtype recombinant viruses were detected in our unselected cohort further demonstrates the value of determining subtype composition within the PR-RT region of pol. Although more conventional env sequence-based subtyping analysis of HIV-1 is highly probative for differentiating pure subtypes and CRFs, the very sequence heterogeneity in this region that makes it valuable for phylogenetic differentiation makes it a relatively poor location for viral recombination (26, 35), and chimeric viruses are likely to remain undetected if only env sequences are analyzed. Sequence analysis of pol is essential, therefore, if a definitive determination is to be made of the significance of viral heterogeneity for antiretroviral response and antiretroviral resistance development.

    In addition to performing phylogenetic analysis on TruGene sequences, we also examined the ability of two public-domain, internet-accessible computer algorithms, namely, the HIV-SEQ program offered at the Stanford University HIV Drug Resistance Database and the Geno2Pheno program of Genafor, to assign subtypes to the non-B viruses in our cohort. Both algorithms performed identically overall, achieving an 88.3% concordance (106/120) with the phylogenetic determination. The nature of the erroneous calls made by the two programs was, however, somewhat different. Since the HIV-SEQ program analyzes the PR and RT gene sequences independently, it was reasonably successful in recognizing chimeric viruses with recombination breakpoints at, or close to, the PR-RT junction, correctly identifying four of seven (57.1%) such recombinants. The Geno2Pheno program, in contrast, determines a single "best fit" genotype for the entire sequence submitted and, not surprisingly, failed to identify a single chimeric sequence. Both algorithms made only a single complete misclassification of a pure subtype: Geno2Pheno identified a subtype C virus as subtype K, while HIV-SEQ identified a different subtype C virus as a K/C chimera. The majority of the errors made by the HIV-SEQ program resulted in it identifying subtype A viruses as belonging to CRF01_AE (Table 2). It is not surprising that a relatively simple algorithm for designating subtype using pol sequence data, such as that employed by HIV-SEQ, should experience difficulty in differentiating these two subtypes; indeed, one could question whether identification of CRFs in this manner should even be attempted. The only other miscalls made by the HIV-SEQ program were the failure to identify two chimeric viral sequences and one CRF (CRF06_cpx) that is not currently included in the program's database. Although the two computer algorithms performed comparably overall, our preference would be to use the HIV-SEQ program for TruGene sequences, both because files generated by OpenGene can be uploaded directly to the website without manipulation and because the independent analysis of PR and RT sequences allows for the detection of at least some chimeric viral sequences.
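
Two ideas from this comparison lend themselves to a small Python sketch: calling PR and RT independently so that discordant calls point to a possible recombinant, and scoring an algorithm's calls against the phylogenetic determination. This is an illustration only and does not reproduce the internal logic of HIV-SEQ or Geno2Pheno; the function names, sample identifiers, and subtype calls are hypothetical.

```python
from typing import Dict, Tuple


def combined_call(pr_subtype: str, rt_subtype: str) -> str:
    """Merge independent PR and RT calls; discordant calls suggest a chimera."""
    if pr_subtype == rt_subtype:
        return pr_subtype
    return f"{pr_subtype}/{rt_subtype} (possible recombinant)"


def concordance(calls: Dict[str, str],
                reference: Dict[str, str]) -> Tuple[int, int, float]:
    """Return (matches, samples compared, percent concordance)."""
    shared = calls.keys() & reference.keys()
    if not shared:
        raise ValueError("no samples in common between calls and reference")
    agree = sum(1 for sample in shared if calls[sample] == reference[sample])
    return agree, len(shared), round(100.0 * agree / len(shared), 1)


# Hypothetical calls, invented for illustration only (not study data).
reference = {"V1": "C", "V2": "A", "V3": "D"}
algorithm = {"V1": "C", "V2": "CRF01_AE", "V3": "D"}

print(combined_call("A", "C"))            # A/C (possible recombinant)
print(concordance(algorithm, reference))  # (2, 3, 66.7)
print(round(100.0 * 106 / 120, 1))        # 88.3, the concordance reported in the text
```

A per-gene comparison of this kind can only detect recombinants whose breakpoint lies at or near the PR-RT junction, which is consistent with the partial success reported for the independent-analysis approach.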

    In conclusion, this study demonstrates that sequence information generated by the TruGene HIV-1 genotyping assay can be effectively used both to detect non-B viruses and to determine the subtype to which such viruses belong. Certain limitations to this analysis exist at the moment, the primary one being the lack of a single contiguous sequence covering most, if not all, of the RT-PCR amplicon. This deficiency hampered conventional phylogenetic analysis of the samples in our cohort, necessitating the generation of additional sequence information for a considerable number of samples. The performance of the simple web-based algorithms as a surrogate for phylogenetic analysis was generally acceptable, although our results suggest that their identification of CRFs other than CRF02_AG should be routinely scrutinized by other means. Intriguingly, Gale and colleagues (17) recently described a rapid, automated, and potentially powerful computational approach for generating subtype information from pol sequences generated for antiretroviral resistance testing, to which they gave the acronym STAR. Perhaps the inclusion of algorithms for identifying non-B subtypes, along with analysis tools similar to the STAR program, in the software packages provided with applications such as the TruGene genotyping assay could help facilitate the important task of monitoring HIV-1 subtype movement and divergence using the increasingly available resource of sequence information generated for determination of antiretroviral resistance.

    REFERENCES

    1. Akinsete, O., T. Sides, D. Hirigoyen, C. P. Cartwright, C. Boraas, C. Davey, L. Pessoa-Brandao, and K. Henry. Unpublished data.

    2. Apetrei, C., D. Descamps, G. Collin, I. Loussert-Ajaka, F. Damond, M. Duca, F. Simon, and F. Brun-Vezinet. 1998. Human immunodeficiency virus type 1 subtype F reverse transcriptase sequence and drug susceptibility. J. Virol. 72:3534-3538.

    3. Barlow, K. L., I. D. Tatt, P. A. Cane, D. Pillay, and J. P. Clewley. 2001. Recombinant strains of HIV type 1 in the United Kingdom. AIDS Res. Hum. Retrovir. 17:467-474.

    4. Becker-Pergola, G., P. Kataaha, L. Johnston-Dow, S. Fung, J. B. Jackson, and S. H. Eshleman. 2000. Analysis of HIV type 1 protease and reverse transcriptase in antiretroviral drug-naive Ugandan adults. AIDS Res. Hum. Retrovir. 16:807-813.

    5. Beddows, S., S. Galpin, S. H. Kazmi, A. Ashraf, A. Johargy, A. J. Frater, N. White, R. Braganza, J. Clarke, M. McClure, and J. N. Weber. 2003. Performance of two commercially available sequence-based HIV-1 genotyping systems for the detection of drug resistance against HIV type 1 group M subtypes. J. Med. Virol. 70:337-342.

    6. Beerenwinkel, N., B. Schmidt, H. Walter, R. Kaiser, T. Lengauer, D. Hoffmann, K. Korn, and J. Selbig. 2002. Diversity and complexity of HIV-1 drug resistance: a bioinformatics approach to predicting phenotype from genotype. Proc. Natl. Acad. Sci. USA 99:8271-8276.

    7. Chaix, M. L., D. Descamps, M. Harzic, V. Schneider, C. Deveau, C. Tamalet, I. Pellegrin, J. Izopet, A. Ruffault, B. Masquelier, L. Meyer, C. Rouzioux, F. Brun-Vezinet, and D. Costagliola. 2003. Stable prevalence of genotypic drug resistance mutations but increase in non-B virus among patients with primary HIV-1 infection in France. AIDS 17:2635-2643.

    8. Cornelissen, M., R. van den Burg, F. Zorgdrager, V. Lukashov, and J. Goudsmit. 1997. pol gene diversity of five human immunodeficiency virus type 1 subtypes: evidence for naturally occurring mutations that contribute to drug resistance, limited recombination patterns, and common ancestry for subtypes B and D. J. Virol. 71:6348-6358.

    9. Descamps, D., C. Apetrei, G. Collin, F. Damond, F. Simon, and F. Brun-Vezinet. 1998. Naturally occurring decreased susceptibility of HIV-1 subtype G to protease inhibitors. AIDS 12:1109-1111.

    10. Dowling, W. E., B. Kim, C. J. Mason, K. M. Wasunna, U. Alam, L. Elson, D. L. Birx, M. L. Robb, F. E. McCutchan, and J. K. Carr. 2002. Forty-one near full-length HIV-1 sequences from Kenya reveal an epidemic of subtype A and A-containing recombinants. AIDS 16:1809-1820.

    11. Dumans, A. T., M. A. Soares, E. S. Machado, S. Hue, R. M. Brindeiro, D. Pillay, and A. Tanuri. 2004. Synonymous genetic polymorphisms within Brazilian human immunodeficiency virus type 1 subtypes may influence mutational routes to drug resistance. J. Infect. Dis. 189:1232-1238.

    12. Elbeik, T., W. G. Alvord, R. Trichavaroj, M. de Souza, R. Dewar, A. Brown, D. Chernoff, N. L. Michael, P. Nassos, K. Hadley, and V. L. Ng. 2002. Comparative analysis of HIV-1 viral load assays on subtype quantification: Bayer Versant HIV-1 RNA 3.0 versus Roche Amplicor HIV-1 Monitor version 1.5. J. Acquir. Immune Defic. Syndr. 29:330-339.

    13. Eshleman, S. H., J. Hackett, P. Swanson, S. P. Cunningham, B. Drews, C. Brennan, S. G. Devare, L. Zekeng, L. Kaptue, and N. Marlowe. 2004. Performance of the Celera Diagnostics ViroSeq HIV-1 genotyping system for sequence-based analysis of diverse human immunodeficiency virus type 1 strains. J. Clin. Microbiol. 42:2711-2717.

    14. Fleck, F. 2004. WHO hopes 3-by-5 plan will reverse Africa's HIV/AIDS epidemic. Bull. W. H. O. 82:77-78.

    15. Fontaine, E., C. Riva, M. Peeters, J.-C. Schmit, E. Delaporte, K. van Laethem, K. van Vaerenbergh, J. Snoeck, E. van Wijngaerden, E. de Clercq, M. van Ranst, and A.-M. Vandamme. 2001. Evaluation of two commercial kits for the detection of genotypic drug resistance on a panel of HIV type 1 subtypes A through J. J. Acquir. Immune Defic. Syndr. 28:254-258.

    16. Fung, H. B., and Y. Guo. 2004. Enfuvirtide: a fusion inhibitor for the treatment of HIV infection. Clin. Ther. 26:352-378.

    17. Gale, C. V., R. Myers, R. S. Tedder, I. G. Williams, and P. Kellam. 2004. Development of a novel human immunodeficiency virus type 1 subtyping tool, subtype analyzer (STAR): analysis of subtype distribution in London. AIDS Res. Hum. Retrovir. 20:457-464.

    18. Gonzales, M. J., R. N. Machekano, and R. W. Shafer. 2001. Human immunodeficiency virus type 1 reverse-transcriptase and protease subtypes: classification, amino acid mutation patterns, and prevalence in a northern California clinic-based population. J. Infect. Dis. 184:998-1006.

    19. Hirsch, M. S., F. Brun-Vezinet, B. Clotet, B. Conway, D. R. Kuritzkes, R. T. D'Aquila, L. M. Demeter, S. M. Hammer, V. A. Johnson, C. Loveday, J. W. Mellors, D. M. Jacobsen, and D. D. Richman. 2003. Antiretroviral drug resistance testing in adults infected with human immunodeficiency virus type 1: 2003 recommendations of an International AIDS Society-USA panel. Clin. Infect. Dis. 37:113-128.

    20. Jagodzinski, L. L., J. D. Cooley, M. Weber, and N. L. Michael. 2003. Performance characteristics of human immunodeficiency virus type 1 (HIV-1) genotyping systems in sequence-based analysis of subtypes other than HIV-1 subtype B. J. Clin. Microbiol. 41:998-1003.

    21. Jorgensen, L. B., M. B. Christensen, J. Gerstoft, L. R. Mathiesen, N. Obel, C. Pedersen, H. Nielsen, and C. Nielsen. 2003. Prevalence of drug resistance mutations and non-B subtypes in newly diagnosed HIV-1 patients in Denmark. Scand. J. Infect. Dis. 35:800-807.

    22. Kantor, R., and D. Katzenstein. 2004. Drug resistance in non-subtype B HIV-1. J. Clin. Virol. 29:152-159.

    23. Kumar, S., K. Tamura, I. B. Jakobsen, and M. Nei. 2001. MEGA2: molecular evolutionary genetics analysis software. Bioinformatics 17:1244-1245.

    24. Kuritzkes, D. R., R. M. Grant, P. Feorino, M. Griswold, M. Hoover, R. Young, S. Day, R. M. Lloyd, C. Reid, G. F. Morgan, and D. L. Winslow. 2003. Performance characteristics of the TRUGENE HIV-1 genotyping kit and the Opengene DNA sequencing system. J. Clin. Microbiol. 41:1594-1599.

    25. Pandrea, I., D. L. Robertson, R. Onanga, F. Gao, M. Makuwa, P. Ngari, I. Bedjabaga, P. Roques, F. Simon, and C. Apetrei. 2002. Analysis of partial pol and env sequences indicates a high prevalence of HIV type 1 recombinant strains circulating in Gabon. AIDS Res. Hum. Retrovir. 18:1103-1116.

    26. Paraskevis, D., E. Magiorkinis, C. Anastassopoulou, M. Lazanas, G. Chrysos, A. M. Vandamme, and A. Hatzakis. 2001. Molecular characterization of a complex, recombinant human immunodeficiency virus type 1 (HIV-1) isolate (A/G/J/K/): evidence to support the existence of a novel HIV-1 subtype. J. Gen. Virol. 82:2509-2514.

    27. Parry, J. V., G. Murphy, K. L. Barlow, K. Lewis, P. A. Rogers, F. J. Belda, A. Nicoll, C. McGarrigle, S. Cliffe, P. P. Mortimer, and J. P. Clewley. 2001. National surveillance of HIV-1 subtypes for England and Wales: design, methods, and initial findings. J. Acquir. Immune Defic. Syndr. 26:381-388.

    28. Pearson, W. R., and D. J. Lipman. 1988. Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. USA 85:2444-2448.

    29. Perez-Alvarez, L., R. Carmona, M. Munoz, E. Delgado, M. M. Thomson, G. Contreras, J. D. Pedreira, R. Rodriguez Real, E. Vazquez de Parga, L. Medrano, J. A. Taboada, R. Najera, et al. 2003. High incidence of non-B and recombinant HIV-1 strains in newly diagnosed patients in Galicia, Spain: study of genotypic resistance. Antivir. Ther. 8:355-360.

    30. Quinones-Mateu, M. E., and E. J. Arts. 1999. Recombination in HIV-1: update and implications. AIDS Rev. 1:89-100.

    31. Robertson, D. L., J. P. Anderson, J. A. Bradac, J. K. Carr, B. Foley, R. K. Funkhouser, F. Gao, B. H. Hahn, M. L. Kalish, C. Kuiken, G. H. Learn, T. Leitner, F. McCutchan, S. Osmanov, M. Peeters, D. Pieniazek, M. Salminen, P. M. Sharp, S. Wolinsky, and B. Korber. 2000. HIV-1 nomenclature proposal. Science 288:55-56.

    32. Snoeck, J., K. Van Laethem, P. Hermans, E. Van Wijngaerden, I. Derdelinckx, Y. Schrooten, D. A. van de Vijver, S. De Wit, N. Clumeck, and A. M. Vandamme. 2004. Rising prevalence of HIV-1 non-B subtypes in Belgium: 1983-2001. J. Acquir. Immune Defic. Syndr. 35:279-285.

    33. Spira, S., M. A. Wainberg, H. Loemba, D. Turner, and B. G. Brenner. 2003. Impact of clade diversity on HIV-1 virulence, antiretroviral drug sensitivity and drug resistance. J. Antimicrob. Chemother. 51:229-240.

    34. Tatt, I. D., K. L. Barlow, A. Nicoll, and J. P. Clewley. 2001. The public health significance of HIV-1 subtypes. AIDS 15(Suppl. 5):S59-S71.

    35. Temin, H. M. 1993. Retrovirus variation and reverse transcription: abnormal strand transfers result in retrovirus genetic variation. Proc. Natl. Acad. Sci. USA 90:6900-6903.

    36. Thomson, M. M., L. Perez-Alvarez, and R. Najera. 2002. Molecular epidemiology of HIV-1 genetic forms and its significance for vaccine development and therapy. Lancet Infect. Dis. 2:461-471.

    37. Weidle, P. J., C. E. Ganea, K. L. Irwin, D. Pieniazek, J. P. McGowan, N. Olivo, A. Ramos, C. Schable, R. B. Lal, S. D. Holmberg, and J. A. Ernst. 2000. Presence of human immunodeficiency virus (HIV) type 1, group M, non-B subtypes, Bronx, New York: a sentinel site for monitoring HIV genetic diversity in the United States. J. Infect. Dis. 181:470-475.

    38. Yang, C., M. Li, Y. P. Shi, J. Winter, A. M. van Eijk, J. Ayisi, D. J. Hu, R. Steketee, B. L. Nahlen, and R. B. Lal. 2004. Genetic diversity and high proportion of intersubtype recombinants among HIV type 1-infected pregnant women in Kisumu, western Kenya. AIDS Res. Hum. Retrovir. 20:565-574.

    (Diane L. Hirigoyen and Ch)