Microsatellite Variation and Evolutionary History of PCDHX/Y Gene Pair Within the Xq21.3/Yp11.2 Hominid-Specific Homology Block
     * IPATIMUP, Instituto de Patologia e Imunologia Molecular da Universidade do Porto, Portugal; Unitat de Biologia Evolutiva, Facultat de Ciències de la Salut i de la Vida, Universitat Pompeu Fabra, Barcelona; and Faculdade de Ciências, Universidade do Porto, Portugal

    E-mail: alopes@ipatimup.pt.

    Abstract

    To better understand the evolutionary dynamics of repetitive sequences in human sex chromosomes, we have analyzed seven new X/Y homologous microsatellites located within PCDHX/Y, one of the two recently described gene pairs in the Xq21.3/Yp11.2 hominid-specific homology block, in samples from Portugal and Mozambique. Sharp differences were observed between X and Y allele distributions, concerning both the presence of private alleles and a different modal repeat length for X-linked and Y-linked markers, and this difference was statistically significant. Higher diversity was found in X-linked microsatellites than in their Y chromosome counterparts; when comparing populations, Mozambicans showed more allele diversity for the X chromosome, but the contrary was true for the Y chromosome microsatellites. Evolutionary patterns, relying on intragenic PCDHX/Y SNPs, also revealed distinct scenarios for X and Y chromosomes. Greater microsatellite diversity was displayed by African X chromosomes within the most common haplotypes shared by both populations, whereas higher microsatellite diversity was found in Portugal for the ancestral Y chromosome haplotype. The most frequent PCDHY haplotype in Portuguese was the derived one, and it was not found in Mozambicans. The TMRCA estimated with the ρ statistic was 13,700 years (7,500–20,000 years), which is consistent with a recent, post–Out-of-Africa origin for this haplotype. In conclusion, the newly described microsatellite loci generally displayed greater X-linked than Y-linked diversity, and this pattern was also detected with slower evolving markers, with a remarkable differentiation between populations observed for Y chromosome haplotypes and, thus, greater divergence among Y chromosomes in human populations.

    Key Words: Microsatellite variation · sex chromosomes · ProtocadherinX/Y · Homo sapiens · evolution

    Introduction

    The study of patterns of variation in nonrecombining homologous regions of the sex chromosomes provides the unique opportunity of approaching the mechanisms underlying the generation and shaping of diversity in two DNA sequences evolving in different genomic environments. The existence of particular genomic environments in these regions is due to both the differentiation experienced by these chromosomes during evolution, outside the pairing regions (Graves 1995), and to the fact that males and females bear different sex chromosome complements. This implies that, within these regions, recombination events are restricted to the X chromosome on females and that sequences located on the X and Y chromosomes are subject to sex-specific dynamics. Moreover, because of a 3:1 ratio of X/Y chromosome population effective sizes, sex chromosomes are exposed to genetic drift to different extents (Jobling and Tyler-Smith 2003).

    Until now, few X/Y homologous microsatellite markers have been described (Scozzari et al. 1997; Karafet et al. 1998; Carvalho-Silva and Pena 2000; Dupuy, Gedde-Dahl, and Olaisen 2000; Kersting et al. 2001). Moreover, the markers analyzed were not always the most suitable for direct X-Y comparisons, because of limitations concerning either the different complexity of repeat structure between sex chromosomes (Carvalho-Silva et al. 1999; Dupuy, Gedde-Dahl, and Olaisen 2000) or the assignment of alleles to the respective sex chromosome by male/female frequency-based inferences (Karafet et al. 1998; Kersting et al. 2001), factors that may have blurred the patterns of diversity.

    In the present study, we analyzed, in a chromosome-specific manner, seven new X/Y homologous microsatellites located within PCDHX/Y, one of the two recently described genes in the Xq21.3/Yp11.2 hominid-specific homology block (Blanco et al. 2000; Blanco-Arias, Sargent, and Affara 2002). Divergence between X and Y sequences in these regions began 3 to 4 MYA, after the translocation that isolated one of the blocks on the nonrecombining region of the Y chromosome (NRY) (Schwartz et al. 1998). Maintenance of high X-Y sequence identity (99%) (Skaletsky et al. 2003) makes this region an optimal model for the comparisons we aimed to perform, because we can in this way minimize the confounding effects of extensive divergence between X-Y homologs. Both diversity comparisons and chromosome-specific patterns of evolution in these regions in human populations are particularly interesting, because they belong to the restricted group of rearranged chromosomic segments that could have been involved in human speciation (Navarro and Barton 2003; Rieseberg and Livingstone 2003).

    To perform these comparisons we have analyzed samples from two available and well-characterized populations, Mozambique (African) and Portugal (European). In each of these populations, a different and highly frequent Y lineage can be found as a result of specific demographic events. These events include, in Europe, the expansion from isolated population nuclei in refuges after the Last Glacial Maximum, which occurred 20,000 to 13,000 years ago and, in Africa, the more recent Bantu agricultural expansion, in the past 3,000 years. These populations would nevertheless guarantee the inclusion of ancient as well as younger Y chromosome lineages in this study (Pereira et al. 2000, 2002).

    To further clarify the evolutionary forces acting on these homologous regions in the framework of the relationships between the two human populations under study, and to gain in this way some perspective on global and population-specific X-Y chromosome comparisons, we analyzed haplotypes defined by the microsatellite markers described here and anchored the microsatellite information on previously reported binary markers within PCDHX/Y genes (Giouzeli et al. 2004).

    Materials and Methods

    Sampling

    Blood samples were collected from 112 healthy unrelated males from one European population (Portugal) and one African population (Mozambique). Mozambique sampling was performed in Maputo (the capital of Mozambique) and included only speakers of the prevailing dialects or languages of the Bantu group. Human genomic DNA was extracted using a Chelex resin method (Lareu et al. 1994). A Pan troglodytes male sample was provided by Dr. Angel Carracedo from the Institute of Legal Medicine, University of Santiago de Compostela (Spain) and one Gorilla gorilla male sample was provided by Patricia Blanco from the Department of Pathology, University of Cambridge (UK).

    Microsatellite Detection and Selection

    The search for new microsatellites within the PCDH genes was performed in silico in Homo sapiens chromosome X and Y genomic contigs (GenBank accession numbers NT_011651 and NT_011896, respectively) using Tandem Repeats Finder (Benson 1999), available at http://tandem.biomath.mssm.edu/trf/trf.html, in a region comprising PCDHX/Y genes and approximately 10 kb of flanking sequences. Potentially polymorphic repeats were selected throughout the gene on the basis of X/Y homology in flanking regions and a maximum (always greater than eight) number of repeated motifs on both sex chromosomes (fig. 1).

    FIG. 1.— Relative positions of the polymorphisms analyzed, within PCDHX/Y genes. Bars represent previously described exons: dark gray bars, coding exons; bars enclosed with a broken line, exons absent in PCDHY. X/Y microsatellites are represented inside circles. Binary markers are above (PCDHY) and below (PCDHX) the scheme. Maximum distance between PCDHX markers is approximately 700 kb.

    PCR Amplification

    The PCR amplification was performed with chromosome-specific primers, taking advantage of single base sequence differences between the X and Y chromosomes in the flanking regions of each marker of interest. Typically, 15 ng of genomic DNA was amplified in a 12.5 μl reaction volume comprising 1.5 mM MgCl2, 20 mM ammonium sulfate, 1 U Taq DNA polymerase (Promega), 200 μM of each dNTP, and 0.3 μM of each primer (see Supplementary Material online). PCR conditions were the following: a 5 min preincubation step at 94°C; 32 to 35 cycles of denaturation at 94°C for 30 sec, annealing for 30 sec at the annealing temperature of the respective primer pair, and extension at 72°C for 30 sec; and a final extension step at 72°C for 10 min.

    Microsatellite Typing

    For microsatellite typing, forward or reverse primers were fluorescently labeled, allowing the distinction of overlapping fragments from different chromosomes. Separation and fragment size analysis of PCR products were performed in an ABI 310 sequencer using the Genescan version 2.1 analysis software.

    Single Nucleotide Polymorphisms (SNP) Typing

    Amplification products were submitted to restriction endonuclease digestion with the enzymes listed in table 1, according to the manufacturer's instructions (MBI Fermentas). The digested products were then electrophoresed on horizontal polyacrylamide gels (T9C5) and visualized with silver staining (Budowle et al. 1991). The presence of either allele in each DNA molecule produced a different number and/or size of bands in the gel. The intronic T→C polymorphism in the X chromosome, found while typing (GT)n 3, was assessed directly by the allele-specific PCR performed to differentiate X/Y products, because the X chromosome presented in that position either the same nucleotide as the Y chromosome (C) or a T. Different primers were then designed to accomplish X-Y specificity for (GT)n 3 microsatellite typing. Three X chromosome binary markers a few base pairs apart, in PCDHX exon 7 (G→A) and intron 7 (T→G and C→A), were typed by sequencing.

    Table 1 Allele Frequency Distributions of PCDHX/Y Binary Markers in Portugal and Mozambique

    Sequencing

    Extensive sequencing was performed on a total of 67 male individuals (56 Portuguese and 11 Mozambican), one Pan troglodytes and one Gorilla gorilla male sample, in 528 bp of a noncoding region of PCDHX/Y. Further sequencing was conducted on selected human samples of virtually all allelic states to confirm RFLP results and to verify repeat structure and assign the respective number of repeats to each fragment size detected in the ABI 310 sequencer. Sequencing of one male Pan troglodytes sample for X chromosome microsatellites was also performed. As a standard procedure only one DNA strand was sequenced. PCR fragments were purified with Microspin S-300 HR columns (Pharmacia). A dideoxy cycle sequencing reaction was carried out using the Big Dye Terminator Cycle Kit (PE Applied Biosystems). The products were purified using SigmaSpin Post-Reaction Clean-Up columns (Sigma) and run in an ABI 3100 sequencer (PE Applied Biosystems). The results were analyzed using the Data Collection software.

    Data Analysis

    Allelic and haplotype frequencies were estimated by direct allele/haplotype counting because only males were sampled. Gene diversity across loci for all populations, population comparisons (differentiation values), and average pairwise differences within populations were calculated with the Arlequin version 2.000 software (Schneider, Roessli, and Excoffier 2000).
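
    As an illustration of the two summary statistics reported per locus (gene diversity and variance in repeat number), the following minimal sketch computes both from haploid allele calls obtained by direct counting. It assumes the unbiased estimator n/(n-1) × (1 - Σp²) for gene diversity; the function names and the allele lists are hypothetical and are not data from this study.

```python
from collections import Counter

def gene_diversity(alleles):
    """Unbiased gene diversity for haploid data: n/(n-1) * (1 - sum(p_i^2)).

    `alleles` holds one entry per chromosome; entries may be repeat counts
    (single locus) or tuples of repeat counts (multi-locus haplotypes).
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two chromosomes")
    counts = Counter(alleles)
    sum_p_squared = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_p_squared)

def repeat_variance(alleles):
    """Sample variance (n - 1 denominator) of the repeat number at one locus."""
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two chromosomes")
    mean = sum(alleles) / n
    return sum((a - mean) ** 2 for a in alleles) / (n - 1)

if __name__ == "__main__":
    # Hypothetical repeat counts for one homologous marker pair.
    x_alleles = [15, 17, 16, 18, 15, 20, 19, 16]
    y_alleles = [12, 12, 12, 12, 13, 12, 12, 11]
    for label, alleles in (("X", x_alleles), ("Y", y_alleles)):
        print(label, gene_diversity(alleles), repeat_variance(alleles))
```

    Because only males were sampled, each individual contributes exactly one X and one Y allele, so no phase inference is needed before counting.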

    Nonparametric tests were used to assay X/Y equality of (1) allelic microsatellite distributions (Kolmogorov-Smirnov) and (2) gene diversity values across microsatellite loci (Mann-Whitney).
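
    The two test statistics named above can be sketched with the standard library alone. The code below computes only the statistics (the two-sample Kolmogorov-Smirnov D and the Mann-Whitney U), not the P values; in practice a statistics package such as SciPy (`scipy.stats.ks_2samp`, `scipy.stats.mannwhitneyu`) would supply those. Function names and the input values are hypothetical.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov D: largest gap between empirical CDFs."""
    n_a, n_b = len(sample_a), len(sample_b)
    d = 0.0
    for v in sorted(set(sample_a) | set(sample_b)):
        cdf_a = sum(1 for x in sample_a if x <= v) / n_a
        cdf_b = sum(1 for x in sample_b if x <= v) / n_b
        d = max(d, abs(cdf_a - cdf_b))
    return d

def mann_whitney_u(sample_a, sample_b):
    """Mann-Whitney U for sample_a, with midranks assigned to tied values."""
    pooled = sorted(list(sample_a) + list(sample_b))
    midrank = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Tied values occupy 1-based ranks i+1 .. j; store their average.
        midrank[pooled[i]] = (i + 1 + j) / 2
        i = j
    rank_sum_a = sum(midrank[x] for x in sample_a)
    n_a = len(sample_a)
    return rank_sum_a - n_a * (n_a + 1) / 2

if __name__ == "__main__":
    # Hypothetical repeat counts (KS) and per-locus gene diversities (U).
    print(ks_statistic([15, 16, 16, 18, 20], [11, 12, 12, 12, 13]))
    print(mann_whitney_u([0.80, 0.75, 0.90], [0.10, 0.00, 0.45]))
```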

    Phylogenetic analyses of both binary and microsatellite variation were performed in Network version 4.0, applying the Reduced-Median (Bandelt et al. 1995) and Median-Joining (Bandelt, Forster, and Rohl 1999) network methods sequentially to resolve extensive reticulation at microsatellite loci (Qamar et al. 2002). In all calculations, ε was set to zero and differential microsatellite weighting was applied to achieve the most parsimonious networks. Weights for each microsatellite were always inversely proportional to the ratio of the variance displayed by each marker in the respective population and the average variance value across loci in that population, within a fourfold range. A null weight was attributed to the (AC)n repeat, considering that this was the only Y microsatellite displaying repeat length frequency distributions that deviate from the Gaussian-like shape expected under the stepwise mutation model. Binary markers' weights were always set to the maximum value (99).
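
    One possible reading of the variance-based weighting can be sketched as follows. The text does not specify how the fourfold range was enforced, so this sketch makes several assumptions that are not taken from the study: the inverse ratio is clamped symmetrically to [1/2, 2], monomorphic loci receive the upper bound, and the result is scaled by a hypothetical base weight of 10 and rounded to an integer. Locus names and variances are illustrative only.

```python
import math

def locus_weights(variances, base=10, fold_range=4.0):
    """Weights inversely proportional to (locus variance / mean variance).

    Assumptions (not from the source): the inverse ratio is clamped to
    [1/sqrt(fold_range), sqrt(fold_range)], zero-variance loci get the upper
    bound, and weights are scaled by `base` and rounded to integers.
    """
    mean_var = sum(variances.values()) / len(variances)
    if mean_var == 0:
        raise ValueError("all loci are monomorphic")
    upper = math.sqrt(fold_range)
    lower = 1.0 / upper
    weights = {}
    for locus, var in variances.items():
        inverse_ratio = mean_var / var if var > 0 else upper
        clamped = min(max(inverse_ratio, lower), upper)
        weights[locus] = round(base * clamped)
    return weights

if __name__ == "__main__":
    # Hypothetical within-population repeat-number variances.
    print(locus_weights({"locus_a": 0.1, "locus_b": 1.9, "locus_c": 1.0}))
```

    The design intent is simply that fast-evolving (high-variance) loci contribute less to network construction than slow-evolving ones, while no locus outweighs another by more than the stated range.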

    Relative age for the Y chromosome haplotype carrying the derived allelic state at both SNPs analyzed was estimated by a phylogenetic approach, using the ρ statistic, which represents the average number of mutational changes between a set of selected haplotypes and the ancestral one (Forster et al. 1996). The ancestral node was defined as the most frequent one that, although presenting the derived SNP haplotype, displayed an STR haplotype shared between ancestral and derived chromosomes. Because STR mutation rates were essential to obtain absolute time estimates, and considering the scarcity of this kind of data on Y chromosome dinucleotide repeats, autosomal STR variation at corresponding loci and the respective effective mutation rate estimate (w) of 1.52 × 10⁻³ (Zhivotovsky et al. 2000) were used to perform variance comparisons that would allow mutation rate estimates specific to the Y markers described here to be obtained, as suggested by Zhivotovsky et al. (2004). Values for the effective mutation rate were then obtained for each Y locus, using the respective repeat length variance within the population used to perform relative age estimates. One absolute time estimate was then generated by combining ρ with the average effective mutation rate across Y loci (6.0 × 10⁻⁴) and a generation time of 25 years.
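
    The dating logic can be sketched in a few lines. The statistic of Forster et al. (1996), usually written ρ, is the frequency-weighted mean number of mutational steps separating the sampled haplotypes from the ancestral node; dividing it by the mutation rate per haplotype per generation gives an age in generations. The sketch below rests on assumptions not stated in the text: steps are counted as summed absolute repeat differences from the ancestral haplotype (a star-like approximation rather than a count along network branches), and the per-haplotype rate is taken as the per-locus average multiplied by the number of loci. All haplotypes and counts shown are hypothetical.

```python
def rho(ancestral, haplotypes):
    """Mean number of single-step mutations from the ancestral haplotype.

    `haplotypes` is a list of (repeat-count tuple, number of chromosomes).
    Steps are summed absolute repeat differences (star-like assumption).
    """
    total_steps = 0
    total_chromosomes = 0
    for haplotype, count in haplotypes:
        steps = sum(abs(a - h) for a, h in zip(ancestral, haplotype))
        total_steps += steps * count
        total_chromosomes += count
    return total_steps / total_chromosomes

def tmrca_years(rho_value, mu_per_locus, n_loci, generation_years=25):
    """Age = rho / (mutation rate per haplotype per generation) * generation time."""
    mu_per_haplotype = mu_per_locus * n_loci
    return rho_value / mu_per_haplotype * generation_years

if __name__ == "__main__":
    # Hypothetical five-locus haplotypes: (repeat counts, chromosome count).
    ancestral = (12, 14, 10, 9, 16)
    sample = [
        ((12, 14, 10, 9, 16), 20),
        ((13, 14, 10, 9, 16), 3),
        ((12, 14, 11, 9, 15), 1),
    ]
    r = rho(ancestral, sample)
    print(r, tmrca_years(r, mu_per_locus=6.0e-4, n_loci=5))
```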

    Results

    Molecular Characterization of the X/Y Markers

    All the markers analyzed revealed a simple (CA)n, (GT)n, or (AC)n repeat structure in both sex chromosomes in humans and in the X chromosome of chimpanzee, differing only in the number of repetitive units and, in some cases, in point mutations and/or small insertions/deletions in the flanking regions of the repeats (see Supplementary Material online). In the specific case of the (GT)n 2 repeat, we detected some degree of variation in a Poly (T) stretch and a (GTT)n trinucleotide in the immediate 5' sequence on the X chromosome but not on the Y chromosome (fig. 2). Despite the physical proximity between these variable units, we decided to take them into account individually for further analysis because, given their heterogeneity, it seemed more consistent to consider the existence of three different repeat motifs rather than a complex repeat on the X chromosome. Nevertheless, more caution is needed in interpreting X-Y comparisons concerning these microsatellites, because there could be some mutual influence on the evolutionary dynamics of these three repetitive sequences that would differ on the X and Y chromosomes. While analyzing variation in the (GT)n 3 dinucleotide on the X chromosome, one polymorphic T→C substitution, 190 bp downstream of the repeat, was found (ss23133234).

    FIG. 2.— Allele frequency distributions of X/Y microsatellites in male individuals from Portugal (n = 112) and Mozambique (n = 112). V = variance in repeat number, GD = gene diversity; white bars X chromosome, black bars Y chromosome.

    Microsatellite Markers

    Comparison of Homologous Allelic Distributions

    The most striking feature of all X-Y homologous markers in both of the populations sampled was the different allelic distributions displayed, according to the chromosome of origin of the microsatellite, concerning both the presence of private alleles and a different modal repeat length for X-linked and Y-linked markers (fig. 2), as reported for other markers (Karafet et al. 1998). These differences were reflected in a statistically significant value in a two-sample Kolmogorov-Smirnov test (P < 0.001 in all X-Y pairs).

    Comparisons of gene diversity between sex chromosomes, although with some variation across loci, have shown an overall trend to reduced levels of intrapopulation gene diversity for Y chromosome versus X chromosome microsatellites in both European and African populations, consistent with the results obtained for another Y-linked marker (Karafet et al. 1998) but opposite to the X-Y diversity pattern observed for other markers (Scozzari et al. 1997). The most extreme difference between X and Y dinucleotide microsatellite distributions was found at (GT)n 1, Poly (T) stretch, and (GTT)n trinucleotide repeats, where the Y repeat number was constant, but the respective X counterpart was moderately to highly polymorphic in both populations analyzed (fig. 2). However, the opposite trend was observed at (AC)n in the Portuguese population. This marker is also exceptional in that the alleles found in the chimpanzee sample for (AC)n, and also for the Poly (T) stretch, are within the range detected in the human Y chromosome sample, whereas for the other microsatellites analyzed, the primate alleles fall in X-specific regions of the human distribution (see Supplementary Material online).

    Overall, the X-linked markers displayed in both populations higher diversity values when compared with their Y counterparts, and this difference was statistically significant (Mann-Whitney: P = 0.025 in Portugal; P = 0.002 in Mozambique). This greater variability is also evident in the higher average repeat length variance values obtained for the X-linked markers (2.39 versus 0.63 in Portugal; 4.45 versus 0.16 in Mozambique).

    Average among-population diversity values revealed contrasting patterns for X-linked and Y-linked markers, being higher in the African population when compared with the European one if X chromosome microsatellites are considered (1.21 ratio) but pointing to a greater European diversity for Y-linked markers (1.44 ratio).

    Plotting within-population variance values for X versus Y homologous microsatellites, we observed that the samples clustered in two groups distinguished mainly by very high or moderately low values of variance for the X chromosome repeats and extremely low values for Y chromosome repeats, with only (AC)n standing out (fig. 3). It is also interesting to notice that we find the samples distributed throughout all the extremes of the quadrants in the graph, except the one representing high values of variance for both X and Y chromosome microsatellites.

    FIG. 3.— Plots of within-population variances for X/Y homologous microsatellites in Portugal and Mozambique.

    Haplotype Analysis

    For direct comparisons of X-Y microsatellite haplotype diversity, all markers were considered for both chromosomes. Concordantly with the results obtained in the locus-by-locus analysis, the X chromosome presented higher haplotype diversity than the Y chromosome in both populations, this effect being more pronounced in the African population (0.995 versus 0.560 in Mozambique; 0.987 versus 0.746 in Portugal). This reduced variation observed for the Y chromosome is reflected in lower diversity values for the African population when compared with the European one for haplotypes of Y-linked markers, whereas the opposite is observed for haplotypes defined by X-linked markers, as it has been described for other markers in the literature (Scozzari et al. 1997).

    No X-Y shared haplotypes were observed, consistent with our previous findings of mostly chromosome-private alleles for individual markers. Microsatellite haplotypes were grouped according to population and chromosome and compared with each other by means of exact tests of population differentiation. All comparisons were statistically significant (P < 0.001).

    Binary Markers

    Allele Frequencies

    The allele frequencies of two binary markers for the Y chromosome and five for its X counterpart were determined (table 1). All of these markers were previously described (Giouzeli et al. 2004), except the T→C substitution in intron 6, which was found during the course of this study and is included in the NCBI database. Determination of the nucleotide present in the orthologous position in chimpanzee for the latter marker allowed us to define the ancestral allele.

    The X-linked SNP found by us was moderately polymorphic, with the derived allele (C) present at a frequency of approximately 20% in both populations surveyed. As for the other markers, the frequencies found in the Portuguese population were quite similar to the ones reported in the previously mentioned study, as expected because the vast majority of samples in both cases belonged to populations with European ancestry, while some differences in allele frequencies were observed in Mozambique (table 1).

    For the Y chromosome, the two previously described allelic combinations of exon 5 base substitutions, T and G or G and T, were found only in the Portuguese population, in a proportion of 43% to 57%, respectively. In Mozambique, only T-G bearing Y chromosomes, corresponding both to X-like and, thus, to ancestral allelic states, were observed.

    Evolutionary Patterns

    The X-linked binary markers segregated as four haplotypes in Mozambique, whereas in Portugal, three additional haplotypes were found (table 2). Because the ancestral alleles at all binary positions were known, evolutionary relationships between all molecular variants found were predicted (fig. 4A). Two mutation events in the ancestral background need to be considered to embrace all haplotypes found, although recombination events could also be involved.

    Table 2 Observed Haplotype Distribution at PCDHX/Y Genes in Portugal and Mozambique

    FIG. 4.— Median-joining networks of X chromosome (A, B, and C) and Y chromosome (D) haplotypes. Circle area is proportional to frequency and branch length is proportional to the number of STR mutations. Different circle fillings symbolize different SNP haplotypes. (A) X-linked haplotypes defined by binary markers in both populations analyzed, where H1 is the ancestral haplotype. (B and C) X chromosome haplotypes inferred from variation at seven microsatellites in Mozambique (B) and Portugal (C). (D) Y chromosome haplotypes defined by five microsatellites in Portugal. The arrow indicates the ancestral node used for the ρ estimate; the derived haplotypes considered for this estimation are in black and connected by solid lines.

    After adding the microsatellite variation data gathered on the X chromosome to each haplotypic background, the Median-joining networks generated gave us an insight into the relative diversity of each molecular variant, revealing a greater diversity for the ancestral haplotype (H1) in both populations when compared with the derived ones (fig. 4B and C). Considering the geographic distribution and frequencies displayed in each population by the X-linked haplotypes, it is plausible to assume the existence of three older haplotypes (H1, H3, and H4), present both in Africa and Europe. Hints of an African ancestry for these shared haplotypes are a higher microsatellite haplotype diversity within Mozambique compared with Portugal for H1 (0.993 versus 0.982) and H3 (0.972 in Mozambique; in Portugal H3 diversity is 1.00 but only two chromosomes can be found). H4 displays similar diversity values in both populations (0.984 in Portugal and 0.980 in Mozambique), but a higher number of average pairwise differences within Mozambican chromosomes (12.027 versus 7.676) reveals the presence of more diverged and, thus, older haplotypes in Africa. The fourth haplotype found to be shared by both populations (H2) is present at very low frequencies (3% in Mozambique and 5% in Portugal) and exhibits equal microsatellite diversity in Portugal and in Mozambique, making its antiquity more difficult to infer. The remaining haplotypes (H5, H6, and H7) are European-specific and have, most probably, a relatively recent origin, taking into account both their peripheral distribution in the network and the low frequencies at which they are found.

    The most frequent PCDHY haplotype in Portuguese, namely H2, was not found in Mozambicans. This both hints at a recent age and allows that age to be estimated by a simple method, which need not take into account population substructure or the action of recombination (as in the X chromosome). The amount of diversity accumulated in the STRs by mutation from a putative ancestral haplotype (which was found in over 80% of the H2 chromosomes [see fig. 4D]) was measured by means of the ρ parameter, which translates linearly into time. The time to most recent common ancestor (TMRCA) of H2 was, thus, estimated as 13,700 years (7,500 to 20,000 years), which is consistent with a recent post–Out-of-Africa origin for this haplotype.

    Sequence Comparisons with Primates

    Given that some of the polymorphisms analyzed are nonsynonymous substitutions (table 1), we decided to include in this study sequence comparisons in 528 bp of noncoding region, in humans and nonhuman primates.

    Five mutations occurred in this region in the human lineage, three of which became fixed as X-Y differences (see Supplementary Material online). From the latter, two are undoubtedly attributable to the Y chromosome and one to the X chromosome, whereas in the remaining two X-Y differences, we found variability within human X chromosomes, presenting in those positions either the Y characteristic allele (T) or an alternative nucleotide (C, A). If we consider that in the orthologous positions of both chimpanzee and gorilla X chromosomes we found the same alleles as in the human Y chromosome, then we must account for two more mutations occurring in the human X chromosome, which have not reached fixation yet. The evenness in changes accumulated in either human sex chromosome is in conformity with previous estimates in the gene free Xq-Yp homology region (Bohossian, Skaletsky, and Page 2000).

    Ancient polymorphism in the X chromosome should also be considered as a possible source of variation that could have become fixed on the Y chromosome after the translocation (Makova and Li 2002), reducing the number of mutations that should be ascertained to the human Y chromosome, although none of the differences observed between human Y chromosome and chimpanzee X was found to be polymorphic in present human X chromosomes.

    Nucleotide diversity (π) for human X sequences (7.3 × 10⁻⁴) was twice that previously described for a region of low recombination (Kaessmann et al. 1999) but in agreement with other values described for the X chromosome (Jaruzelska, Zietkiewicz, and Labuda 1999) and for autosomes (Zhao et al. 2000). As for the Y chromosome, no variation was found in human sequences, except a 10-bp deletion in one individual, which was not accounted for in these calculations, in accordance with previously described reduced diversity levels in this chromosome (Dorit, Akashi, and Gilbert 1995; Hammer 1995; Underhill et al. 1996).

    Discussion

    We have analyzed seven new X/Y homologous microsatellites in samples from Portugal and Mozambique and found higher diversity in the X chromosome microsatellites than in their Y chromosome counterparts; when comparing populations, Mozambicans showed more allele diversity for the X chromosome, but the contrary was true for the Y chromosome microsatellites.

    In a previous study, Scozzari et al. (1997) found the same patterns for X/Y microsatellites when comparing African to European internal diversity but, overall, found more diversity in the two Y than in the X microsatellites. This may be the product of random deviation from an expected model of greater diversity on the X than on the Y chromosome, as discussed below. The X/Y homologous loci studied by Karafet et al. (1998) matched our overall pattern of greater X than Y diversity but showed more diversity for Africans than for Europeans.

    The inconsistency of X and Y microsatellites in attributing more diversity to African versus European populations, as would be expected under the Out-of-Africa model and as observed for other microsatellite loci and mtDNA (Vigilant et al. 1991; Bowcock et al. 1994; Jorde et al. 1995) deserves further investigation as it cannot be resolved by the present study, given the restricted distribution of the African samples analyzed. The reduced diversity observed by us for Y-linked markers in Africa is most probably related to the exclusively Bantu-speaking origin of the individuals sampled, which have been shown to exhibit lower levels of diversity than Europeans (Lucotte et al. 1994; Pereira et al. 2002).

    A number of different factors may account for the larger diversity observed by us on the X chromosome compared with the Y chromosome. First, founding Y chromosome alleles translocated from the X chromosome may have been on the short end of the X microsatellite distribution, leading to the stabilization or even loss of polymorphism on the Y locus (Carvalho-Silva et al. 1999). However, for some loci, the Y microsatellite displayed reduced or absent variability, although presenting allele lengths only one step apart or even in the same allele length range as the X alleles.

    Second, the X and Y homologous regions differ in their genome dynamics: recombination operates on the X chromosome but not on Yp11.2. Recombination hot spots have been reported to associate with long (GT)n repeats (Majewski and Ott 2000). Unequal crossing-over might explain the bimodal, non-Gaussian allele length distribution at the (GT)n 1 locus (which contrasts starkly with its monomorphic Y counterpart). However, it is unlikely that recombination contributes much allele diversity as compared with replication slippage, given that the empirically determined mutation rates for Y chromosome and trinucleotide and tetranucleotide autosomal microsatellites are very similar (Kayser et al. 2000; Zhivotovsky et al. 2004), although these observations remain to be confirmed for dinucleotide repeats. Sequence diversity is generated in the ampliconic regions of the Y chromosome by gene conversion (Skaletsky et al. 2003; Bosch et al. 2004). PCDHY is, however, a single-copy gene outside extensive repetitive regions, and no such phenomenon has been reported in this gene, despite the extensive sequence survey by DHPLC undertaken by Giouzeli et al. (2004).

    Third, different male/female demographic and reproductive patterns may be the main cause of the difference in genetic diversity between the two sex chromosomes. The effective population size of the Y chromosome would be one-third that of the X chromosome if both sexes contributed equally to the next generation. Reproductive habits conferring higher reproductive variance for males, such as polygyny, can further reduce Y chromosome Ne (Underhill et al. 1996; Charlesworth 2001; Dupanloup et al. 2003), leading to a reduction of polymorphism caused by genetic drift. In fact, the pronounced reduction of Y-linked microsatellite diversity when compared with the X chromosome in Mozambique is most probably explained by a model of asymmetric gene flow of paternal and maternal lineages between food-producer populations (Bantu-speakers) and hunter-gatherers within Africa, jointly with different levels of polygyny and patrilocality in each population, as proposed by Destro-Bisol et al. (2004) and supported by other data (Pereira et al. 2001, 2002).
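
    The one-third expectation, and the way unequal male and female contributions shift it, can be checked with the standard textbook expressions for sex-linked effective population size: Ne(X) = 9·Nm·Nf / (4·Nm + 2·Nf) and Ne(Y) = Nm / 2, where Nm and Nf are the numbers of breeding males and females. The sketch below uses a reduced number of breeding males as a crude stand-in for higher male reproductive variance; that simplification, the function names, and the population sizes are assumptions for illustration only.

```python
def ne_x(breeding_males, breeding_females):
    """Effective population size of X-linked loci."""
    nm, nf = breeding_males, breeding_females
    return 9 * nm * nf / (4 * nm + 2 * nf)

def ne_y(breeding_males):
    """Effective population size of Y-linked loci."""
    return breeding_males / 2

def x_to_y_ratio(breeding_males, breeding_females):
    """Ratio Ne(X) / Ne(Y); equals 3 when both sexes contribute equally."""
    return ne_x(breeding_males, breeding_females) / ne_y(breeding_males)

if __name__ == "__main__":
    # Equal contribution of both sexes (hypothetical sizes).
    print(x_to_y_ratio(500, 500))
    # Fewer breeding males, as a rough proxy for polygyny.
    print(x_to_y_ratio(250, 500))
```

    With equal sex contributions the ratio is exactly 3; halving the number of breeding males raises it to 4.5, which illustrates how polygyny can depress Y chromosome diversity beyond the baseline expectation.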

    A recent selective sweep may have also contributed to a general reduction of variation in the Y chromosome, although there is little direct evidence for such an event (Pérez-Lezaun et al. 1997; Thomson et al. 2000). Our sequence data, even though not very powerful, show that although substitutions have accumulated evenly between the X and Y chromosomes, nucleotide diversity is far lower for the Y chromosome, reinforcing the view of a reduction of diversity on the Y chromosome.

    Evolutionary patterns relying on intragenic PCDHX/Y SNPs also revealed distinct scenarios for X and Y chromosomes. A higher number of X and Y haplotypes was found in Europe than in Africa, reflecting the ascertainment in European populations of all polymorphisms analyzed. Nevertheless, greater microsatellite diversity was displayed by African X chromosomes within the three most common haplotypes shared by both populations, whereas for the widespread ancestral Y chromosome haplotype, higher microsatellite diversity was found in Portugal: although both populations presented the same number of haplotypes, 65% of Mozambican Y chromosomes shared the same STR haplotype. As the T-G ancestral haplotype (H1) must be older in Africa, an elimination of STR lineages within this Y chromosome group in Mozambique must have taken place, most probably through the Bantu migration and expansion along the Eastern coast of Africa (Salas et al. 2002).

    The biased ascertainment of the binary markers under consideration must also be invoked to explain the lack of putative recombinant X chromosome haplotypes in the older Mozambican population, where lower levels of LD are expected (Reich et al. 2001). SNPs with low minor allele frequencies tend to present higher LD (Ke et al. 2004); ascertainment bias implies that SNPs tend to be less polymorphic in populations other than those where the SNPs were ascertained, thus increasing LD. However, from microsatellite data, it is possible to infer an African ancestry for X chromosome SNP-defined haplotypes shared by both populations, because of the higher diversity within Mozambique for these haplotypes, a result in accordance with an early population bottleneck scenario shaping European diversity, as expected under the Out-of-Africa model. The frequency distribution of X chromosome haplotypes based on binary markers alone is suggestive of a population expansion within Europe, with two frequent haplotypes in Portugal followed by a tail of rare ones.
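    The link between missing recombinant haplotypes, low minor allele frequency, and apparent LD can be illustrated with a minimal sketch that computes the usual pairwise measures D' and r2 from two-SNP haplotype counts. The function name and the counts are hypothetical illustrations, not data from this study.

```python
import math


def ld_stats(n_AB: int, n_Ab: int, n_aB: int, n_ab: int):
    """Return (D, D', r2) for two biallelic loci from haplotype counts."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n   # frequency of allele A at the first SNP
    p_B = (n_AB + n_aB) / n   # frequency of allele B at the second SNP
    d = p_AB - p_A * p_B      # raw disequilibrium coefficient
    # Largest |D| attainable given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    d_prime = abs(d) / d_max if d_max > 0 else math.nan
    r2 = d * d / denom if denom > 0 else math.nan
    return d, d_prime, r2


if __name__ == "__main__":
    # Hypothetical counts: rare minor alleles and one of the four
    # haplotypes never observed (no putative recombinant).
    print(ld_stats(95, 3, 0, 2))
```

    Whenever one of the four possible haplotypes is absent, D' equals 1 regardless of allele frequencies, whereas r2 can remain well below 1. With rare minor alleles a small sample is unlikely to contain the fourth haplotype, which is one way low-frequency, population-ascertained SNPs can inflate apparent LD.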

    The asymmetric distribution of PCDHY haplotypes between Portugal and Mozambique is plausibly explained by a recent origin of the derived haplotype (H2), approximately 7,500 to 20,000 years ago, as suggested by the Y STR haplotype phylogeny. This estimate partially overlaps with the Last Glacial Maximum, which occurred in Europe 20,000 to 13,000 years ago, and points to a more recent origin for this haplotype compared with the M173 mutation (30,000 years) (Semino et al. 2000) that defines most of the Y lineages within clade R, highly frequent in Western Europe. The ancestral PCDHY haplotype (H1) must be found in several Y chromosome lineages because the Mozambican Y chromosome pool, which is characterized by its exclusive presence, comprises lineages belonging to some of the older African Y chromosome clades as well as more recent and widespread ones (Pereira et al. 2002). Thus, the considerable difference in frequency of the two observed PCDHY haplotypes is not surprising in view of their different ages and bearing in mind that genetic drift is particularly powerful in shaping diversity on the NRY, accelerating differentiation between populations. These two haplotypes, which differ at both SNPs, are a further example of "yin yang haplotypes" in the human genome (Zhang et al. 2003). The intervening haplotypes were not detected in the present survey, although we cannot rule out their existence in other populations, namely in Asia.
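    For readers unfamiliar with dating a haplotype from STR variation, the following is a minimal sketch of a rho-type estimate, in which the average number of repeat-unit steps separating sampled STR haplotypes from an assumed founder haplotype is converted into time. It is a simplification that assumes a star-like phylogeny and single-step mutations; the function name, the toy haplotypes, the mutation rate, and the generation time are all hypothetical placeholders and are not the values used in this study.

```python
def rho_age(haplotypes, founder, mu_per_locus_per_gen, years_per_gen):
    """Return (rho, age_in_years) from STR haplotypes and an assumed founder.

    haplotypes: list of tuples of repeat counts, one tuple per chromosome
    founder:    tuple of repeat counts for the assumed ancestral haplotype
    """
    n_loci = len(founder)
    # Stepwise distance of each chromosome from the founder, summed over loci.
    steps = [
        sum(abs(h[i] - founder[i]) for i in range(n_loci))
        for h in haplotypes
    ]
    rho = sum(steps) / len(steps)
    # Expected mutations per generation across all loci is mu * n_loci.
    generations = rho / (mu_per_locus_per_gen * n_loci)
    return rho, generations * years_per_gen


if __name__ == "__main__":
    # Toy data only: four chromosomes typed at two STR loci.
    sample = [(10, 12), (10, 12), (11, 12), (10, 13)]
    rho, age = rho_age(sample, founder=(10, 12),
                       mu_per_locus_per_gen=0.001, years_per_gen=25)
    print(rho, age)
```

    The result scales inversely with the assumed mutation rate and directly with the assumed generation time, so the width of any interval around such an estimate depends heavily on those two inputs.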

    Conclusion

    The newly described microsatellite loci generally displayed greater X-linked than Y-linked diversity, and this pattern was also detected with slower evolving markers, with a remarkable differentiation between populations observed for Y chromosome haplotypes. Overall, information drawn from Yp11.2 polymorphisms is in accordance with the diversity patterns observed for other Y-linked loci, which is expected because of the absolute linkage that characterizes the NRY. The analyzed binary markers may also help refine the Y chromosome world phylogeny, namely by distinguishing lineages within clade R.

    The parental sequence in Xq21.3 records the major signatures of African and European population movements from both the maternal and paternal points of view, and diversity patterns in this locus seem to be in agreement with the long-standing view of European diversity representing a subset of the ancestral gene pool within Africa, although more informative markers are needed for a detailed picture.

    Differences observed for the presently described X/Y microsatellites are, thus, expected to reflect both chromosome-specific mutation rates and historical contexts of the populations surveyed, within the complex framework of male/female specificities, which cannot be disentangled in an obvious way. Empirical determination of mutation rates for these markers on both sex chromosomes would be of great interest to quantify the relative contribution of each chromosome in creating new alleles.

    Supplementary Material

    Details on the primers used in this study, as well as all sequence alignments, are supplied on the MBE Web site as Supplementary Material online.

    Acknowledgements

    The authors would like to thank Maria Giouzeli, from the Department of Psychiatry of the University of Oxford, for providing all information available on the SNPs within PCDHX/Y and for her help with primer design. We thank Dr. Albertino Damasceno and Dr. Benilde Soares of the Eduardo Mondlane University (Maputo) for kindly providing the Mozambican samples. We also would like to acknowledge Patricia Blanco for her scientific advice. This work was partially supported by Fundação para a Ciência e Tecnologia (through grant SFRH/BD/7006/2001 and POCTI, Programa Operacional Ciência, Tecnologia e Inovação).

    References

    Bandelt, H. J., P. Forster, and A. Rohl. 1999. Median-joining networks for inferring intraspecific phylogenies. Mol. Biol. Evol. 16:37–48.

    Bandelt, H. J., P. Forster, B. C. Sykes, and M. B. Richards. 1995. Mitochondrial portraits of human populations using median networks. Genetics 141:743–753.

    Benson, G. 1999. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27:573–580.

    Blanco, P., C. A. Sargent, C. A. Boucher, M. Mitchell, and N. A. Affara. 2000. Conservation of PCDHX in mammals: expression of human X/Y genes predominantly in brain. Mamm. Genome 11:906–914.

    Blanco-Arias, P., C. A. Sargent, and N. A. Affara. 2002. The human-specific Yp11.2/Xq21.3 homology block encodes a potentially functional testis-specific TGIF-like retroposon. Mamm. Genome 13:463–468.

    Bohossian, H. B., H. Skaletsky, and D. C. Page. 2000. Unexpectedly similar rates of nucleotide substitution found in male and female hominids. Nature 406:622–625.

    Bosch, E., M. E. Hurles, A. Navarro, and M. A. Jobling. 2004. Dynamics of a human inter-paralog gene conversion hotspot. Genome Res. 14:835–844.

    Bowcock, A. M., A. Ruiz-Linares, J. Tomfohrde, E. Minch, J. R. Kidd, and L. L. Cavalli-Sforza. 1994. High resolution of human evolutionary trees with polymorphic microsatellites. Nature 368:455–457.

    Budowle, B., R. Chakraborty, A. M. Giusti, A. J. Eisenberg, and R. C. Allen. 1991. Analysis of the VNTR locus D1S80 by the PCR followed by high-resolution PAGE. Am. J. Hum. Genet. 48:137–144.

    Carvalho-Silva, D. R., and S. D. Pena. 2000. Molecular characterization and population study of an X chromosome homolog of the Y-linked microsatellite DYS391. Gene 247:233–240.

    Carvalho-Silva, D. R., F. R. Santos, M. H. Hutz, F. M. Salzano, and S. D. Pena. 1999. Divergent human Y-chromosome microsatellite evolution rates. J. Mol. Evol. 49:204–214.

    Charlesworth, B. 2001. The effect of life-history and mode of inheritance on neutral genetic variability. Genet. Res. 77:153–166.

    Destro-Bisol, G., F. Donati, V. Coia, I. Boschi, F. Verginelli, A. Caglià, S. Tofanelli, G. Spedini, and C. Capelli. 2004. Variation of female and male lineages in sub-Saharan populations: the importance of socio-cultural factors. Mol. Biol. Evol. (in press).

    Dorit, R. L., H. Akashi, and W. Gilbert. 1995. Absence of polymorphism at the ZFY locus on the human Y chromosome. Science 268:1183–1185.

    Dupanloup, I., L. Pereira, G. Bertorelle, F. Calafell, M. J. Prata, A. Amorim, and G. Barbujani. 2003. A recent shift from polygyny to monogamy in humans is suggested by the analysis of worldwide Y-chromosome diversity. J. Mol. Evol. 57:85–97.

    Dupuy, B. M., T. Gedde-Dahl, and B. Olaisen. 2000. DXYS267: DYS393 and its X chromosome counterpart. Forensic Sci. Int. 112:111–121.

    Forster, P., R. Harding, A. Torroni, and H. J. Bandelt. 1996. Origin and evolution of Native American mtDNA variation: a reappraisal. Am. J. Hum. Genet. 59:935–945.

    Giouzeli, M., N. A. Williams, L. J. Lonie, L. E. DeLisi, and T. J. Crow. 2004. ProtocadherinX/Y, a candidate gene-pair for schizophrenia and schizoaffective disorder: a DHPLC investigation of genomic sequence. Am. J. Med. Genet. 129B:1–9.

    Graves, J. A. 1995. The origin and function of the mammalian Y chromosome and Y-borne genes—an evolving understanding. Bioessays 17:311–320.

    Hammer, M. F. 1995. A recent common ancestry for human Y chromosomes. Nature 378:376–378.

    Jaruzelska, J., E. Zietkiewicz, and D. Labuda. 1999. Is selection responsible for the low level of variation in the last intron of the ZFY locus? Mol. Biol. Evol. 16:1633–1640.

    Jobling, M. A., and C. Tyler-Smith. 2003. The human Y chromosome: an evolutionary marker comes of age. Nat. Rev. Genet. 4:598–612.

    Jorde, L. B., M. J. Bamshad, W. S. Watkins, R. Zenger, A. E. Fraley, P. A. Krakowiak, K. D. Carpenter, H. Soodyall, T. Jenkins, and A. R. Rogers. 1995. Origins and affinities of modern humans: a comparison of mitochondrial and nuclear genetic data. Am. J. Hum. Genet. 57:523–538.

    Kaessmann, H., F. Heissig, A. von Haeseler, and S. Paabo. 1999. DNA sequence variation in a non-coding region of low recombination on the human X chromosome. Nat. Genet. 22:78–81.

    Karafet, T., P. de Knijff, E. Wood, J. Ragland, A. Clark, and M. F. Hammer. 1998. Different patterns of variation at the X- and Y-chromosome-linked microsatellite loci DXYS156X and DXYS156Y in human populations. Hum. Biol. 70:979–992.

    Kayser, M., L. Roewer, M. Hedman et al. (11 co-authors). 2000. Characteristics and frequency of germline mutations at microsatellite loci from the human Y chromosome, as revealed by direct observation in father/son pairs. Am. J. Hum. Genet. 66:1580–1588.

    Ke, X., S. Hunt, W. Tapper et al. (12 co-authors). 2004. The impact of SNP density on fine-scale patterns of linkage disequilibrium. Hum. Mol. Genet. 13:577–588.

    Kersting, C., C. Hohoff, B. Rolf, and B. Brinkmann. 2001. Pentanucleotide short tandem repeat locus DXYS156 displays different patterns of variations in human populations. Croat. Med. J. 42:310–314.

    Lareu, M. V., C. P. Phillips, A. Carracedo, P. J. Lincoln, D. Syndercombe Court, and J. A. Thomson. 1994. Investigation of the STR locus HUMTH01 using PCR and two electrophoresis formats: UK and Galician Caucasian population surveys and usefulness in paternity investigations. Forensic Sci. Int. 66:41–52.

    Lucotte, G., N. Gerard, R. Krishnamoorthy, F. David, A. Aouizerate, and P. Galzot. 1994. Reduced variability in Y-chromosome-specific haplotypes for some Central African populations. Hum. Biol. 66:519–526.

    Majewski, J., and J. Ott. 2000. GT repeats are associated with recombination on human chromosome 22. Genome Res. 10:1108–1114.

    Makova, K. D., and W. H. Li. 2002. Strong male-driven evolution of DNA sequences in humans and apes. Nature 416:624–626.

    Navarro, A., and N. H. Barton. 2003. Chromosomal speciation and molecular divergence-accelerated evolution in rearranged chromosomes. Science 300:321–324.

    Pereira, L., L. Gusmão, C. Alves, A. Amorim, and M. J. Prata. 2002. Bantu and European Y-lineages in sub-Saharan Africa. Ann. Hum. Genet. 66:369–378.

    Pereira, L., V. Macaulay, A. Torroni, R. Scozzari, M. J. Prata, and A. Amorim. 2001. Prehistoric and historic traces in the mtDNA of Mozambique: insights into the Bantu expansions and the slave trade. Ann. Hum. Genet. 65:439–458.

    Pereira, L., M. J. Prata, M. A. Jobling, and A. Amorim. 2000. Analysis of the Y-chromosome and mitochondrial DNA pools in Portugal. Pp. 191–195 in C. Renfrew and K. Boyle, eds. Archaeogenetics: DNA and the population prehistory of Europe. McDonald Institute for Archeological Research, University of Cambridge, Cambridge, UK.

    Pérez-Lezaun, A., F. Calafell, M. Seielstad, E. Mateu, D. Comas, E. Bosch, and J. Bertranpetit. 1997. Population genetics of Y-chromosome short tandem repeats in humans. J. Mol. Evol. 45:265–270.

    Qamar, R., Q. Ayub, A. Mohyuddin, A. Helgason, K. Mazhar, A. Mansoor, T. Zerjal, C. Tyler-Smith, and S. Q. Mehdi. 2002. Y-chromosomal DNA variation in Pakistan. Am. J. Hum. Genet. 70:1107–1124.

    Reich, D. E., M. Cargill, S. Bolk, J. Ireland, P. C. Sabeti, D. J. Richter, T. Lavery, R. Kouyoumjian, S. F. Farhadian, R. Ward, and E. S. Lander. 2001. Linkage disequilibrium in the human genome. Nature 411:199–204.

    Rieseberg, L. H., and K. Livingstone. 2003. Evolution. Chromosomal speciation in primates. Science 300:267–268.

    Salas, A., M. Richards, T. De la Fe, M. V. Lareu, B. Sobrino, P. Sanchez-Diz, V. Macaulay, and A. Carracedo. 2002. The making of the African mtDNA landscape. Am. J. Hum. Genet. 71:1082–1111.

    Schneider, S., D. Roessli, and L. Excoffier. 2000. Arlequin ver. 2.000: a software for population genetics data analysis. Genetics and Biometry Laboratory, Department of Anthropology, University of Geneva, Switzerland.

    Schwartz, A., D. C. Chan, L. G. Brown, R. Alagappan, D. Pettay, C. Disteche, B. McGillivray, A. de la Chapelle, and D. C. Page. 1998. Reconstructing hominid Y evolution: X-homologous block, created by X-Y transposition, was disrupted by Yp inversion through LINE-LINE recombination. Hum. Mol. Genet. 7:1–11.

    Scozzari, R., F. Cruciani, P. Malaspina et al. (18 co-authors). 1997. Differential structuring of human populations for homologous X and Y microsatellite loci. Am. J. Hum. Genet. 61:719–733.

    Semino, O., G. Passarino, P. J. Oefner et al. (14 co-authors). 2000. The genetic legacy of Paleolithic Homo sapiens sapiens in extant Europeans: a Y chromosome perspective. Science 290:1155–1159.

    Skaletsky, H., T. Kuroda-Kawaguchi, P. J. Minx et al. (37 co-authors). 2003. The male-specific region of the human Y chromosome is a mosaic of discrete sequence classes. Nature 423:825–837.

    Thomson, R., J. K. Pritchard, P. Shen, P. J. Oefner, and M. W. Feldman. 2000. Recent common ancestry of human Y chromosomes: evidence from DNA sequence data. Proc. Natl. Acad. Sci. 97:7360–7365.

    Underhill, P. A., L. Jin, R. Zemans, P. J. Oefner, and L. L. Cavalli-Sforza. 1996. A pre-Columbian Y chromosome-specific transition and its implications for human evolutionary history. Proc. Natl. Acad. Sci. 93:196–200.

    Vigilant, L., M. Stoneking, H. Harpending, K. Hawkes, and A. C. Wilson. 1991. African populations and the evolution of human mitochondrial DNA. Science 253:1503–1507.

    Zhang, J., W. L. Rowe, A. G. Clark, and K. H. Buetow. 2003. Genomewide distribution of high-frequency, completely mismatching SNP haplotype pairs observed to be common across human populations. Am. J. Hum. Genet. 73:1073–1081.

    Zhao, Z., L. Jin, Y. X. Fu et al. (13 co-authors). 2000. Worldwide DNA sequence variation in a 10-kilobase noncoding region on human chromosome 22. Proc. Natl. Acad. Sci. 97:11354–11358.

    Zhivotovsky, L. A., L. Bennett, A. M. Bowcock, and M. W. Feldman. 2000. Human population expansion and microsatellite variation. Mol. Biol. Evol. 17:757–767.

    Zhivotovsky, L. A., P. A. Underhill, C. Cinnioglu et al. (14 co-authors). 2004. The effective mutation rate at Y chromosome short tandem repeats, with application to human population-divergence time. Am. J. Hum. Genet. 74:50–61.