Male-Biased Mutation Rate and Divergence in Autosomal, Z-Linked and W-Linked Introns of Chicken and Turkey
Molecular Biology and Evolution, 2004, No. 8
Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Sweden
E-mail: Hans.Ellegren@ebc.uu.se.
Abstract
To investigate mutation-rate variation between autosomes and sex chromosomes in the avian genome, we have analyzed divergence between chicken (Gallus gallus) and turkey (Meleagris gallopavo) sequences from 33 autosomal, 28 Z-linked, and 14 W-linked introns with a total ungapped alignment length of approximately 43,000 bp. There are pronounced differences in the mean divergence among autosomes and sex chromosomes (autosomes [A] = 10.08%, Z chromosome = 10.99%, and W chromosome = 5.74%), and we use these data to estimate the male-to-female mutation-rate ratio (m) from Z/A, Z/W, and A/W comparisons at 1.71, 2.37, and 2.52, respectively. Because the m estimates of the three comparisons do not differ significantly, we find no statistical support for a specific reduction in the Z chromosome mutation rate (Z reduction estimated at 4.89%, P = 0.286). The idea of mutation-rate reduction in the sex chromosome hemizygous in one sex (i.e., X in mammals, Z in birds) has been suggested on the basis of theory on adaptive mutation-rate evolution. If it exists in birds, the effect would, thus, seem to be weak; a preliminary power analysis suggests that it is significantly less than 18%. Because divergence may vary within chromosomal classes as a result of variation in mutation and/or selection, we developed a novel double-bootstrapping method, bootstrapping both by introns and sites from concatenated alignments, to estimate confidence intervals for chromosomal class rates and for m. The narrowest interval for the m estimate is 1.88 to 2.97 from the Z/W comparison. We also estimated m using maximum likelihood on data from all three chromosome classes; this method yielded m = 2.47 and approximate 95% confidence intervals of 2.27 to 2.68. Our data are broadly consistent with the idea that mutation-rate differences between chromosomal classes can be explained by the male mutation bias alone.
Key Words: male-biased mutation · Z chromosome · W chromosome · adaptive mutation rates · nonparametric bootstrapping
Introduction
How do mutation rates vary within genomes? To date, most vertebrate studies on mutation-rate variation have focused on mammals, with increasing evidence of significant local and regional substitution rate variation at putative neutral sites within mammalian chromosomes (Wolfe, Sharp, and Li 1989; Matassi, Sharp, and Gautier 1999; Williams and Hurst 2000; Lercher, Williams, and Hurst 2001; Smith, Webster, and Ellegren 2002; Waterston et al. 2002; Hardison et al. 2003). The causes of this variation are poorly understood (Ellegren, Smith, and Webster 2003), although the observation of covariation of substitution rates in orthologous regions along independent primate lineages shows that regional rate variation is deterministic and repeatable (Smith, Webster, and Ellegren 2002; Hardison et al. 2003). Attempts to explain mutation-rate heterogeneity include invoking sequence context effects (Silva and Kondrashov 2002; Zhao and Boerwinkle 2002; Arndt, Petrov, and Hwa 2003; Smith, Webster, and Ellegren 2003), an association between mutation and recombination processes (Lercher and Hurst 2002; Waterston et al. 2002; Hardison et al. 2003; Hellmann et al. 2003), and the evolution of isochores and the correlation between GC content and substitution rates (Eyre-Walker and Hurst 2001). Understanding how and why mutation rates vary is important not only in the contexts of the molecular basis for mutation and genome evolution but also for addressing the possibility for selection to modify mutation rates (mutation-rate evolution [Sniegowski et al. 2000]).
Substitution-rate variation is also seen at the level of individual chromosomes. There are significant differences in the mean substitution rate among autosomes in various mammalian comparisons (Lercher, Williams, and Hurst 2001; Ebersberger et al. 2002). The sex chromosomes show the most extreme variation, with the X chromosome evolving slower than autosomes and the Y chromosome evolving faster than autosomes (Li, Yi, and Makova 2002). At least two factors are thought to affect mutation rate variation between the mammalian sex chromosomes. First, if replication error that occurs during germline cell division is the major mutagenic process, then the much greater number of germline cell divisions in males than in females should increase the Y chromosome mutation rate relative to the X chromosome (Miyata et al. 1987). This difference underlies the argument for a male-biased mutation or male-driven evolution. In primates, molecular evolutionary analyses show Y chromosome divergence to be approximately 2.2 times higher than X chromosome divergence (Shimmin, Chang, and Li 1993; Makova and Li 2002; but see Bohossian, Skaletsky, and Page [2000]), which translates to a male-to-female mutation rate ratio (m) of 4 to 6. In rodents, m is estimated at approximately 2 (Chang et al. 1994) and in goats at 3 to 4 (Lawson and Hewitt 2002), potentially indicating a correlation between generation time and m.
Second, we may expect selection to favor a reduced mutation rate on the X chromosome because of the hemizygous exposure of recessive deleterious mutations (McVean and Hurst 1997). This theory was supported by an early finding that the X-linked synonymous substitution rate in the mouse-rat comparison was reduced to a greater extent than could be caused by a male mutation bias alone (McVean and Hurst 1997). However, the current evidence for such an adaptive reduction in the X-linked mutation rate of mammals is weak. For example, in a human-chimpanzee comparison of genomic sequences, the reduction in X-linked rates can be explained by a male mutation bias and high ancestral polymorphism because confidence intervals of the estimates of the male mutation bias derived from different chromosome comparisons X/A, Y/A, and Y/X overlap (Ebersberger et al. 2002). A similar conclusion was recently reached by Malcom, Wyckoff, and Lahn (2003) from extensive human-mouse and mouse-rat comparisons that used synonymous substitution rates. It is unclear why the results from the two latter studies are not consistent with the observation of McVean and Hurst (1997).
The sex chromosome system of birds (ZZ males and ZW females) offers an interesting contrast to that of mammals, not least because the avian sex chromosomes evolved independently of those in mammals; that is, mammalian X chromosomes and avian Z chromosomes are not syntenic (Fridolfsson et al. 1998; Nanda et al. 1999, 2002). We expect the avian Z chromosome to have an elevated mutation rate caused by the male mutation bias because the Z chromosome spends two-thirds of its time in males, where rates of germline cell division are high. Accordingly, Z should evolve faster than the female-specific W chromosome. This prediction is supported by analyses of substitution rates in gametologous introns shared between the Z and W chromosome of various bird lineages (Ellegren and Fridolfsson 1997; Kahn and Quinn 1999; Carmichael et al. 2000; Fridolfsson and Ellegren 2000; Bartosch-Härlid et al. 2003); the Z chromosome evolves faster than the W chromosome. There is some variation in the different estimates of avian m (1.7 to 6.5), but the confidence intervals associated with these estimates are large, and so far, all estimates have been based on molecular evolutionary analyses of a limited number of short introns or short coding sequences.
Predictions for an adaptive reduction in X-linked mutation rates apply in the same way to the avian Z chromosome. Because deleterious mutations will be exposed on the Z chromosome when hemizygous in females, selection may favor a reduced mutation rate on this chromosome. The effects of adaptive mutation rates and the male mutation bias are, thus, expected to work in opposition in birds, leading to reduced and increased Z rate, respectively. Note that avian W as well as mammalian Y is always hemizygous but, as will be discussed later, several lines of argument suggest that there is little potential for adaptive mutation-rate evolution on these chromosomes. The evidence for Z evolving faster than W seems unambiguous, but this does not rule out that the Z rate is lower than what should be expected from male-biased mutation alone.
Fortunately, a quantitative assessment of the role of the male mutation bias and chromosome-specific mutation rates in birds can be obtained from comparisons of substitution rates in autosomes (A), the Z chromosome, and the W chromosome. If the male mutation bias is the main factor governing the mean mutation rate of chromosomes, then estimates of m should be similar from Z/A, Z/W, and A/W comparisons (cf. Miyata et al. 1987; McVean and Hurst 1997; Malcom, Wyckoff, and Lahn 2003). On the other hand, if the Z chromosome mutation rate is specifically reduced, m estimated from A/W should be higher than when estimated from Z/A and Z/W comparisons. To address this issue, we here make a large-scale attempt to analyze substitution rate variation in the avian genome by studying divergence in roughly 43 kb of orthologous, noncoding sequence of chicken (Gallus gallus) and turkey (Meleagris gallopavo). We obtain data from 74 different introns on autosomes, Z chromosome, and W chromosome and contrast m estimates from Z/A, Z/W, and A/W comparisons.
Materials and Methods
Collection of Sequence Data
Chicken and turkey intron sequences were derived for autosomal, Z-linked or W-linked genes (map information from ArkDB farm animal database at www.thearkdb.org or Schmid et al. [2000]), with the criterion of using only introns longer than 200 bp to reduce stochastic variation in estimates of divergence. This decision was motivated by the use of a novel bootstrapping method that bootstraps by both introns and sites. Also, excluding short introns may reduce the effect of constraint on small introns that results from conservation of splice sites. Because the exon-intron organization is not given for most avian genes in GenBank, we first Blasted chicken cDNA sequences against the draft human genome at NCBI (July 2003 build). Large gaps in the avian sequence produced in such Blast alignments should represent positions of putative introns; for all genes analyzed in this study, this approach revealed putative avian introns at precisely the same positions as in orthologous human genes. After this procedure, we designed exonic PCR primers for sequencing of both chicken and turkey introns (Appendix 1–3). For a few genes, the full genomic sequence, including exons and introns, was available in chicken. In those cases, exonic primers were designed for amplification in turkey only.
Chicken and turkey DNA was extracted from fresh muscle tissue by standard proteinase K digestion and phenol/chloroform purification, adapted from Hoelzel and Green (1998). PCR reactions were carried out in 50-μl reaction volumes that contained 20 to 250 ng of DNA template, 1× PCR Gold buffer (Applied Biosystems), 0.2 μM of each primer, 2.0 mM MgCl2, 0.2 mM dNTPs (Amersham Pharmacia Biotech Inc), and 1 U AmpliTaq Gold (Applied Biosystems). Amplification reactions were performed using an initial denaturation at 95°C for 5 min, followed by 33 to 40 cycles at 94°C for 30 s and specific annealing and extension conditions for every intron. Amplified fragments were purified using the Qiaquick purification protocol (Qiagen), sequenced using BigDye™ Terminator Cycle Sequencing chemistry with original primers (Applied Biosystems), and sequences were recorded with an ABI377 semiautomated sequencing instrument (Applied Biosystems). Sites found to be heterozygous in direct sequencing were excluded from analysis. In some cases, sequencing was preceded by cloning, which was done using the pGEM-T vector kit. Sequencing reactions were then initiated with M13 vector primers. Sequence data from this article have been deposited with the GenBank Data Library under accession numbers AF006660, AF526055, AY139836 to AY139865, AY142943 to AY142944, AY144673 to AY144682, AY189754 to AY189777, AY194125 to AY194147, AY298959 to AY299013, AY380785 to AY380789, and AY426725 to AY426737.
Sequence Analysis
Orthologous chicken and turkey introns were aligned by use of ClustalW on default settings (Thompson, Higgins, and Gibson 1994), although some manual adjustment was required to improve the alignment of repetitive sequences. Pairwise distances were estimated by use of the baseml program in PAML version 3.11 (Yang 1997), with the Tamura-Nei (Tamura and Nei 1993) model of sequence evolution. Distances were estimated on the assumption that all sites evolve at the same rate (i.e., no among-site rate variation).
The estimation of confidence intervals and hypothesis testing was carried out by application of nonparametric bootstrapping. We developed a new bootstrapping procedure, termed double bootstrapping. For a given chromosome category (A, Z, or W) we first bootstrapped by introns, randomly sampling introns with replacement to give the same total number of introns as in the original data set, and then, for each of the intron alignments, we bootstrapped by sites, randomly sampling sites with replacement to generate alignments of the same length as the originals. The first stage of the bootstrapping procedure accounts for variation in substitution rates between different introns, as may be caused by regional variation in mutation and/or selection (reviewed in Ellegren, Smith, and Webster [2003]). Our preliminary observations suggest that this variation may be significant in bird genomes and that it is present in autosomes as well as the sex chromosomes (S. Berlin, N. G. C. Smith, and H. Ellegren, unpublished data). Note, for instance, that the point estimates of divergence in autosomal introns vary between 3.9% and 18.5% (table 1), although these estimates are associated with large confidence intervals. We are currently investigating the causes of this rate heterogeneity. Preliminary analyses suggest that conserved sites or blocks explain part of the variation, but an underlying variation in the mutation rate contributes as well. One important implication of regional substitution-rate variation is that estimates of m can be heavily biased when based on individual introns. The second stage of the bootstrapping accounts for noise generated during the estimation of divergence. Pairwise distances were calculated for each of the alignments after the double bootstrapping, and the unweighted mean of these distances was taken as the output.
Table 1 Data for Autosomal Introns.
The bootstrapping process was repeated 1,000 times, thereby giving 1,000 sets of W-linked, Z-linked, and autosomal divergences from which to estimate the male mutation bias (m) and other rate statistics. The standard deviation of the bootstrap values gives an estimate of the standard error of the bootstrapped statistic (Sokal and Rohlf 1995). Hypothesis testing that required the comparison of rate statistics was performed by direct comparison of randomized bootstrap values.
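The two-stage resampling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the toy alignments are hypothetical, and a simple proportion of differing sites (p-distance) stands in for the Tamura-Nei distances that were estimated with baseml.

```python
import random
import statistics

def p_distance(sites):
    """Proportion of differing sites; a stand-in for the Tamura-Nei distance."""
    return sum(a != b for a, b in sites) / len(sites)

def double_bootstrap(alignments, n_reps=1000, seed=0):
    """Return n_reps bootstrap values of the unweighted mean divergence.

    alignments: list of (seq_a, seq_b) pairs, one ungapped pairwise
    alignment per intron (hypothetical input format).
    """
    rng = random.Random(seed)
    introns = []
    for seq_a, seq_b in alignments:
        if len(seq_a) != len(seq_b):
            raise ValueError("aligned sequences must have equal length")
        introns.append(list(zip(seq_a, seq_b)))
    reps = []
    for _ in range(n_reps):
        # Stage 1: resample introns with replacement (same number of introns).
        sampled = rng.choices(introns, k=len(introns))
        # Stage 2: resample sites with replacement within each sampled intron.
        dists = [p_distance(rng.choices(sites, k=len(sites))) for sites in sampled]
        # Output of one replicate: unweighted mean over the resampled introns.
        reps.append(statistics.fmean(dists))
    return reps

# Toy data, invented purely for illustration (not from the study).
toy_introns = [
    ("ACGTACGTACGTACGTACGT", "ACGTACGAACGTACGTACGA"),
    ("TTGACCTTGACCTTGACCTT", "TTGACCTTGACCTTGACCTT"),
    ("GGCATGGCATGGCATGGCAT", "GGCTTGGCATGGAATGGCAT"),
]
reps = double_bootstrap(toy_introns, n_reps=1000, seed=42)
print("median:", statistics.median(reps))
print("bootstrap SE:", statistics.stdev(reps))
```

The standard deviation of the returned values plays the role of the bootstrap standard error, and statistics such as m can be computed replicate by replicate from the A, Z, and W outputs.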
Results
A set of autosomal, Z-linked and W-linked introns were sequenced and analyzed in chicken and turkey, two species with a divergence time of 28 MYA, estimated by use of mitochondrial DNA–based molecular clocks (Dimcheff, Drovetski, and Mindell 2002). We obtained 33 orthologous autosomal alignments with a total ungapped length of 16,188 bp (table 1), 28 Z-linked alignments with total length 16,079 bp (table 2), and 14 W-linked alignments with total length 10,621 bp (table 3).
Table 2 Data for Z-linked Introns.
Table 3 Data for W-linked introns.
There are pronounced differences in mean divergence among autosomes and sex chromosomes (autosomes [A] = 10.06%, Z chromosome = 10.95%, and W chromosome = 5.71%). When the divergences were calculated using the double-bootstrapping method (see Materials and Methods), we obtained the following values of medians and standard errors for the concatenated alignments: A = 10.08% ± 0.67%, Z = 10.99% ± 0.48%, and W = 5.74% ± 0.49%. Thus, both autosomal and Z-linked sequences seem to evolve at just under twice the rate of W-linked sequences in the chicken–turkey comparison. Figure 1 shows the distribution of bootstrap values for autosomes, the Z chromosome, and the W chromosome with the double-bootstrapping method.
FIG. 1. Histogram of 1,000 bootstrap values of autosomal, W-linked, and Z-linked divergences. Bootstrap values were obtained using the double-bootstrap method (see Materials and Methods)
Divergence data from autosomes and sex chromosomes allow the partitioning of the effects of the male mutation bias and one other factor that affects substitution rates. If substitution rates are solely determined by the male mutation bias, then all three estimates of the male mutation bias (m) based on pairwise comparisons of divergence in pairs of chromosome categories (equations 1 to 3, using the approach of Miyata et al. [1987]) should give the same value.
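Equations 1 to 3 are not reproduced in this version of the text. A reconstruction that is consistent with the scaling factors stated in the Discussion (KW = 1, KA = (1 + m)/2, KZ = (1 + 2m)/3), and with the reported point estimates, is given here; the ordering Z/A, Z/W, A/W is an assumption based on the order in which the estimates are listed.

```latex
\begin{align}
\frac{Z}{A} &= \frac{2(1+2m)}{3(1+m)}
  &&\Longrightarrow& m &= \frac{3\,(Z/A) - 2}{4 - 3\,(Z/A)} \\
\frac{Z}{W} &= \frac{1+2m}{3}
  &&\Longrightarrow& m &= \frac{3\,(Z/W) - 1}{2} \\
\frac{A}{W} &= \frac{1+m}{2}
  &&\Longrightarrow& m &= 2\,(A/W) - 1
\end{align}
```

Here Z, A, and W denote the divergences of the respective chromosome categories. The Z/A ratio can only range between 2/3 and 4/3, so small sampling errors in Z/A translate into large changes in m, which is in line with the wide confidence interval reported for that comparison.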
From equations 1 to 3, we obtain the following m estimates: Z/A = 1.71 (95% confidence interval 0.62 to 7.16), Z/W = 2.37 (1.18 to 2.97), A/W = 2.52 (1.88 to 3.34). The bootstrap distribution of m estimates for each comparison is shown in figure 2.
FIG. 2. Histogram of 1,000 bootstrap values of the male-to-female mutation-rate ratio m(A/W), m(Z/W), and m(Z/A). Bootstrap values were obtained by use of the double-bootstrap method
We can quantify any discrepancies in the three estimates of m by using rearrangements of equations 1 to 3 to predict divergence in one chromosome category, given observed divergences in the two other. Here, we predict divergence on the Z chromosome, given the autosomal and W-linked data. Comparing the expected and observed Z-linked divergences then gives the percentage reduction in the Z-linked substitution rate, termed Zr (equation 4).
Using the double-bootstrapping method, the median value of Zr is 4.89%, with a large bootstrap standard error of 8.50%. So in this approach, there is no evidence of a significant discrepancy (P = 0.286; one-tailed probability of Zr ≤ 0) from the male mutation bias predictions. Hence, with the present data set, the male-mutation bias is sufficient to explain observed differences in divergence of autosomal, Z-linked, and W-linked sequences in the chicken–turkey comparison.
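As a worked check, the following sketch recomputes m and Zr from the median divergences reported above. The formulas are reconstructions, not quotations: they assume the Miyata et al. (1987) relationships Z/W = (1 + 2m)/3, A/W = (1 + m)/2, and Z/A = 2(1 + 2m)/(3(1 + m)), and they assume that Zr is defined as 1 minus the ratio of observed to expected Z divergence, with the expectation derived from A and W. All function names are hypothetical.

```python
def m_from_za(z, a):
    """m from the Z/A divergence ratio (assumed Miyata-type relationship)."""
    r = z / a
    return (3 * r - 2) / (4 - 3 * r)

def m_from_zw(z, w):
    """m from the Z/W divergence ratio."""
    return (3 * z / w - 1) / 2

def m_from_aw(a, w):
    """m from the A/W divergence ratio."""
    return 2 * a / w - 1

def z_reduction(a, z, w):
    """Fractional shortfall of observed Z divergence relative to expectation.

    Expected Z follows from m estimated on A/W: Z_exp = W * (1 + 2m) / 3,
    which simplifies to (4A - W) / 3.
    """
    z_expected = (4 * a - w) / 3
    return 1 - z / z_expected

# Median divergences (in %) from the double bootstrap, as reported in the text.
A, Z, W = 10.08, 10.99, 5.74

m_za = m_from_za(Z, A)
m_zw = m_from_zw(Z, W)
m_aw = m_from_aw(A, W)
zr = z_reduction(A, Z, W)
print(f"m(Z/A) = {m_za:.2f}, m(Z/W) = {m_zw:.2f}, m(A/W) = {m_aw:.2f}")
print(f"Zr = {100 * zr:.2f}%")
```

The values obtained this way are close to, but not identical with, the reported figures, because the reported figures are medians over bootstrap replicates rather than functions of the median divergences.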
It is noteworthy that the m estimates from the three possible comparisons remain consistent despite the significant differences in GC content between autosomes and sex chromosomes: autosomes = 46.7%, Z chromosome = 36.3%, and W chromosome = 33.1% (t-tests: W-A, P << 0.0001; Z-A, P << 0.0001; Z-W, P = 0.088). This finding suggests that GC content does not have a strong effect on mean divergence of chromosomal classes, at least not in the sense that it would affect m estimates.
The observed divergences can be used for estimation of absolute substitution rates between chicken and turkey. Using a divergence time of 28 MYA (Dimcheff, Drovetski, and Mindell 2002), we obtain rates of A = 3.6 × 10⁻⁹, Z = 3.9 × 10⁻⁹ and W = 2.0 × 10⁻⁹ substitutions per site per year. These rate estimates may be useful for dating events in bird evolution based on nuclear sequence data, although substitution rates may, of course, vary among avian lineages. In mammals, substitution rates in the human and mouse lineages since the split of primates and rodents have been estimated at 2.2 × 10⁻⁹ and 4.5 × 10⁻⁹, respectively (Waterston et al. 2002). Note that these latter substitution rates represent the average rates since the time of divergence, and that current rates may differ even more as the difference in generation time between human and most rodents should be more significant now than shortly after divergence (assuming a generation time effect on substitution rates).
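The arithmetic behind these rates can be sketched as follows. Dividing the pairwise divergence by the divergence time (rather than by twice the divergence time) is an inference from the reported numbers, so the results are best read as rates for the chicken–turkey pair as a whole and not per lineage.

```python
DIVERGENCE_TIME_YEARS = 28e6  # chicken-turkey split, as used in the text

# Median pairwise divergences (substitutions per site) reported in the Results.
divergence = {"A": 0.1008, "Z": 0.1099, "W": 0.0574}

# Pairwise divergence divided by divergence time (assumed convention).
rates = {chrom: d / DIVERGENCE_TIME_YEARS for chrom, d in divergence.items()}

for chrom, rate in rates.items():
    print(f"{chrom}: {rate:.2e} substitutions per site per year")
```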
It is interesting to note that the substitution rate estimates for the chicken–turkey comparison are very similar to those obtained from restriction site analysis of several galliform species (Helm-Bychowski and Wilson 1986). These authors mapped 161 restriction sites from three autosomal regions and associated estimates of divergence with fossil evidence. Using different calibration points in galliform evolution, they arrived at rate estimates of 3.4 × 10⁻⁹ to 4.0 × 10⁻⁹ substitutions per site per year. Apparently, the relatively early molecular evolutionary work of Helm-Bychowski and Wilson (1986) is in good agreement with data from large-scale DNA sequencing.
Discussion
In this study, we addressed between-chromosome variation in mutation rates in birds. Our approach was to use variation in intronic divergence to infer mutation-rate variation directly. How well justified is this inference? The most likely complication is the effect of selection on intronic sequences in birds. Little is yet known on this subject, but given that comparative studies are uncovering numerous nongenic conserved regions in mammalian genomes (e.g., see Dermitzakis et al. [2002]) and that bird genomes are smaller than mammalian genomes, it seems possible that conserved regulatory sequences may be relatively common in avian genomes. However, given that a large number of introns were analyzed for each chromosome category and that there may be regional mutation-rate variation or local effects of selection, the present data set may better reflect the mean substitution rate of each chromosome category than a similar amount of sequence derived from a single region for each category.
The Male Mutation Bias
Our point estimates of m for the chicken–turkey comparison are 1.71, 2.37, or 2.52, depending on whether they are based on the Z/A, Z/W, or A/W comparison. Previous studies of male-biased mutation in birds have revealed m estimates in the range of 1.7 to 6.5 (Ellegren and Fridolfsson 1997; Kahn and Quinn 1999; Carmichael et al. 2000; Fridolfsson and Ellegren 2000; Bartosch-Härlid et al. 2003), so our point estimates are in the lower range of reported values (table 4). However, most of the estimates are associated with large confidence intervals, and the confidence intervals obtained in this study (0.62 to 7.16, 1.18 to 2.97, and 1.88 to 3.34) generally overlap with those of previous studies.
Table 4 Estimates of m in Birds.
If we assume a single true value of m, then we can estimate it by combining data from autosomes, the Z chromosome, and the W chromosome. For each intron, the observed number of substitutions [O(S)] is given by the ungapped length of the alignment multiplied by the divergence. Note that although we refer to these data as "observed," strictly speaking they are inferred from the alignments, but here we assume perfect inference of divergence. Then assuming a single m, we can calculate the expected number of substitutions, E(S), for each intron as the product of ungapped alignment length, a chromosome-specific scaling factor, and a normalizing factor that ensures the sum of expected substitutions equals the sum of observed substitutions. The chromosome-specific scaling factors (K) reflect the proportion of time spent in the male and female germlines: KW = 1, KA = (1+ m)/2, and KZ = (1+2m)/3. The log-likelihood of m can be approximated by the G-test statistic (page 692 in Sokal and Rohlf [1995]), which uses the O(S) and E(S) values to generate a maximum-likelihood estimate of m = 2.47. Approximate 95% confidence intervals can be estimated as the range of m for which the log-likelihood is within 2 units of the maximum, which yields a range of 2.27 to 2.68.
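The procedure just described can be sketched in Python. Because the per-intron tables are not reproduced here, the sketch treats each chromosome class as a single pooled "intron", using the total lengths and mean divergences from the Results; that pooling, the grid search, and all names are assumptions for illustration, so the output is not expected to match the reported estimate and interval exactly.

```python
import math

# (chromosome class, ungapped length in bp, mean divergence) from the Results.
# Pooling each class into one entry is an assumption; the study used introns.
DATA = [("A", 16188, 0.1006), ("Z", 16079, 0.1095), ("W", 10621, 0.0571)]

def scaling(chrom, m):
    """Chromosome-specific scaling factor K for a given m."""
    if chrom == "W":
        return 1.0
    if chrom == "A":
        return (1.0 + m) / 2.0
    if chrom == "Z":
        return (1.0 + 2.0 * m) / 3.0
    raise ValueError(f"unknown chromosome class: {chrom}")

def log_likelihood(m, data):
    """Approximate log-likelihood of m as -G/2, with G the G-test statistic."""
    observed = [length * div for _, length, div in data]
    weights = [length * scaling(chrom, m) for chrom, length, _ in data]
    # Normalizing factor: expected substitutions sum to the observed total.
    norm = sum(observed) / sum(weights)
    g = 2.0 * sum(o * math.log(o / (w * norm))
                  for o, w in zip(observed, weights) if o > 0)
    return -g / 2.0

def fit_m(data, lo=1.0, hi=5.0, step=0.01, drop=2.0):
    """Grid-search estimate of m and the range within `drop` log-likelihood units."""
    n = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n + 1)]
    scored = [(log_likelihood(m, data), m) for m in grid]
    best_ll, m_hat = max(scored)
    inside = [m for ll, m in scored if best_ll - ll <= drop]
    return m_hat, min(inside), max(inside)

m_hat, ci_low, ci_high = fit_m(DATA)
print(f"m = {m_hat:.2f}, approximate interval {ci_low:.2f} to {ci_high:.2f}")
```

A grid search is used instead of a numerical optimizer only to keep the sketch dependency-free; with one parameter on a bounded range it is sufficient.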
Because the amount of sequence data in the present study exceeds that of earlier avian studies by one or two orders of magnitude, and the data are based on autosomes as well as sex chromosomes, our maximum-likelihood m estimate may be viewed as the most accurate estimate yet obtained for birds. This contention is substantiated by the fact that our estimate is based upon sequence data from a large number of regions from each chromosome category. Regional mutation-rate variation would make m estimates sensitive to the particular regions used for molecular evolutionary analysis. Previous studies of male-biased mutation in birds (e.g., Ellegren and Fridolfsson [1997], Kahn and Quinn [1999], Carmichael et al. [2000], and Bartosch-Härlid et al. [2003]), as well as many studies in other organisms (e.g., Shimmin, Chang, and Li [1993], Bohossian, Skaletsky, and Page [2000], and Makova and Li [2002]), have been based on one or a few genomic regions only.
Given the overlap in confidence intervals between the present m estimates in the chicken–turkey comparison and those obtained in studies of other bird species, it would be premature to conclude that the male mutation bias is lower in galliforms than in other birds. On the other hand, a low point estimate of 1.8 was independently obtained for chicken and turkey using ATP5A1Z/ATP5A1W intron sequences (Carmichael et al. 2000), and the same estimate was obtained in a three-species comparison that included one anseriform and two other galliform species using CHD1Z/CHD1W introns (Bartosch-Härlid et al. 2003). We have recently found evidence of m being higher in avian lineages with longer generation time and with higher intensity of sexual selection, which suggests a link between life-history characteristics and the male mutation bias (Bartosch-Härlid et al. 2003). Most galliforms breed at the age of 1 year, so a rather weak male mutation bias would be consistent with a generation time effect.
A potential problem in estimation of the male mutation bias from sex-linked sequences is the effect of ancestral polymorphism, which can bias estimates of m when distances are low and lineage sorting is incomplete (Makova and Li 2002; Ellegren 2002a). However, the effect of ancestral polymorphism in our study is expected to be minimal because all pairwise distances between chicken and turkey are relatively high (5% to 11%), and with a divergence time of 28 MYA, lineage sorting should have been completed long ago. Makova and Li (2002) found ancient polymorphism to affect estimates of m in the human–chimpanzee comparison (1% divergence) but not in comparisons of human and more distantly related primates.
Is There a Reduction in the Z Chromosome Mutation Rate?
Our results indicated that in birds, the Z-linked introns evolve slightly faster than autosomal introns, which, in turn, evolve much faster than W-linked introns. These qualitative findings are in keeping with the male mutation bias predictions of Miyata et al. (1987), and there is, thus, no apparent need to invoke factors additional to the male mutation bias to explain variation in divergence among chromosomal classes. Alternatively, we can view our results as indicating that such potential factors, if they exist, must be weak, which is opposite to some previous suggestions (McVean and Hurst 1997).
There is the theoretical possibility that the increased efficacy of selection against slightly deleterious mutations on the avian Z chromosome relative to the autosomes (Charlesworth, Coyne, and Barton 1987) could reduce the mutation rate on the Z chromosome relative to null expectations. However, a quantitative analysis provides confidence that weak selection is unlikely to be responsible, for example, for a 5% reduction (the estimated value of Zr) in the Z chromosome mutation rate. Irrespective of the nature of dosage compensation in birds (see Ellegren [2002b]), the expected substitution rate of autosomal relative to Z-linked sequences, RA/Z, is given by (see equations 8a and 9 in Charlesworth, Coyne, and Barton [1987]):
Even if all slightly deleterious mutations are recessive (McVean and Charlesworth 1999), a conservative assumption with respect to the strength of selection required, then equation 5 shows that a 5% reduction in Z-linked divergence requires the magnitude of Ns, the product of the selective coefficient of mutations and the effective population size, to be unrealistically high.
We conclude that some theoretical arguments as well as our empirical observations do not support a significantly reduced Z-chromosome mutation rate. On the other hand, failure to demonstrate an effect does not mean that it does not exist. Our data set may simply have been too small to allow detection of a modest reduction in Z rate. Additional data are, thus, needed to firmly settle the question and should be accompanied by a power analysis to reveal what minimum reduction in the Z-chromosome mutation rate would be detectable with the data. Such an analysis would require a deeper insight into the patterns and causes of substitution rate heterogeneity among introns or other chromosomal regions. However, as a preliminary analysis of the power of our data set to determine the maximum value of a putative Z reduction, we took the 95th percentile of the 1,000 double-bootstrap estimates of Zr. This method indicated that Zr is significantly less than 18%. We note that this value is considerably less than the point estimate of Zr = 30% in rodents obtained from the X and A substitution data of McVean and Hurst (1997) combined with the assumption of a male-to-female mutation bias of 2 in rodents (Chang et al. 1994).
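The bound and the one-tailed probability were taken from the empirical bootstrap distribution, which is not reproduced here. As a rough plausibility check only, a normal approximation built from the reported median (4.89%) and bootstrap standard error (8.50%) of Zr gives values of similar magnitude. The normality assumption is introduced for this sketch and is not part of the analysis.

```python
from statistics import NormalDist

ZR_MEDIAN = 4.89  # percent, reported bootstrap median of Zr
ZR_SE = 8.50      # percent, reported bootstrap standard error of Zr

# Assumption: the bootstrap distribution of Zr is approximately normal.
approx = NormalDist(mu=ZR_MEDIAN, sigma=ZR_SE)

p_no_reduction = approx.cdf(0.0)   # share of the distribution with Zr <= 0
upper_95 = approx.inv_cdf(0.95)    # approximate 95th percentile of Zr

print(f"P(Zr <= 0) is roughly {p_no_reduction:.3f}")
print(f"95th percentile of Zr is roughly {upper_95:.1f}%")
```

Small differences from the reported P = 0.286 and the 18% bound are expected, since the empirical distribution need not be symmetric.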
Why Not a Reduced W-Chromosome Mutation Rate?
It should be noted that we might as well have predicted divergence on the W chromosome, given the observed autosomal and Z-linked rates. Similarly, in theory, we could have inferred the reduction in W-linked substitution rates, Wr, by comparison of expected and observed divergences (table 4). Using the present data set, the outcome of such an analysis would have been the same as the analysis of a possible reduction on Z; all comparisons give the same one-tailed probability (P = 0.286) and, thus, provide no statistical support for deviations from expectations from the male mutation bias. However, the question of which approach is the correct one is still warranted from a general perspective: Shall we rely on the observed W rate to calculate the expected Z rate (via comparison with autosomes), or shall we use the observed Z rate to predict the rate on W? Put in other words, should potential discrepancies between the three ways of estimating m be interpreted as a reduction in the substitution rate on the Z chromosome or as a reduction in the substitution rate on the W chromosome (or as some combination of the two factors)? In fact, the corresponding question applies to studies of mammalian sex chromosomes: Shall the observed Y rate be used to predict the rate on X, or shall the observed X rate be used to calculate the expected rate on Y? Previous work in this field has ignored the latter possibility and only addressed the possible reduction in X-chromosome mutation rate (McVean and Hurst 1997; Nachman and Crowell 2000; Ebersberger et al. 2002; Malcom, Wyckoff, and Lahn 2003).
There is no way to differentiate between these possibilities with present data, because three chromosome categories mean only two degrees of freedom, and one of those is used to estimate the male mutation bias. If we had some way of knowing the "true" male mutation bias, we could use the extra degree of freedom to differentiate between a W reduction and a Z reduction by seeing which of the Z/A and A/W comparisons gave the m estimate closest to the true value. However, evolutionary theory may help to resolve the issue. The argument for a reduction in the Z chromosome mutation rate (McVean and Hurst 1997) relies on cost-benefit considerations of adaptive mutation rates. Given that the Z chromosome contains many more genes than the tiny W chromosome (probably by at least two orders of magnitude [Ellegren 2000; Schmid et al. 2000]), the strength of selection for a reduced mutation rate should be expected to be much higher on Z than on W. Moreover, note that the benefit of a reduced mutation rate is the avoidance of deleterious mutations being exposed in a hemizygous chromosome. Although W is always hemizygous, a majority of the avian W-linked genes so far characterized have highly similar, and likely functionally equivalent, homologs on Z (Ellegren 2002b). Recessive mutations in these W-linked genes are likely to be masked by the gametologous Z-linked genes, thereby giving little benefit of mutation-rate reduction.
The Double-Bootstrapping Method
Estimating the confidence intervals of divergence measures by bootstrapping individual sites from concatenated data sets is commonly applied in molecular evolutionary studies. However, the increasing support for regional mutation-rate variation within genomes (reviewed in Ellegren, Smith, and Webster [2003]), which includes our preliminary data for birds, necessitated an adjustment in the method for estimating the confidence intervals of chromosomal class divergences. Simply concatenating all alignments and then bootstrapping by sampling with replacement from all sites is not sufficient, because it does not account for the variation generated by the choice of a limited number of introns. To account for regional variation, we developed the double-bootstrapping method, which bootstraps by introns and by sites within intronic alignments and then takes the unweighted mean of the divergences of the bootstrapped alignments. With this method, we found no statistical support for a specific reduction of the Z-chromosome mutation rate. Had we only bootstrapped by sites, slightly different divergence medians and substantially different standard errors would have been obtained (A = 10.27% ± 0.28%, Z = 10.89% ± 0.28%, and W = 5.66% ± 0.26%). Note that the much lower standard errors compared with the double-bootstrapping procedure are a consequence of not taking rate variation among introns into account. Importantly, with this approach, there would have been significant support for a reduction in the Z-chromosome mutation rate (Zr = 7.88%, P = 0.026). We believe this observation is one of general significance and that it calls for careful statistical treatment of molecular evolutionary data sets in the presence of underlying rate heterogeneity.
Acknowledgements
Financial support was obtained from the Swedish Research Council. H.E. is a Royal Academy of Sciences Research Fellow supported by the Knut and Alice Wallenberg Foundation. We thank Scott Edwards and two anonymous reviewers for useful comments.
Literature Cited
Arndt, P. F., D. A. Petrov, and T. Hwa. 2003. Distinct changes of genomic biases in nucleotide substitution at the time of mammalian radiation. Mol. Biol. Evol. 20:1887-1896.
Bartosch-Härlid, A., S. Berlin, N. G. C. Smith, A. P. Møller, and H. Ellegren. 2003. Life history and the male mutation bias. Evolution 57:2398-2406.
Bohossian, H. B., H. Skaletsky, and D. C. Page. 2000. Unexpectedly similar rates of nucleotide substitution found in male and female hominids. Nature 406:622-625.
Carmichael, A. N., A. K. Fridolfsson, J. Halverson, and H. Ellegren. 2000. Male-biased mutation rates revealed from Z and W chromosome-linked ATP synthase alpha-subunit (ATP5A1) sequences in birds. J. Mol. Evol. 50:443-447.
Chang, B. H. J., L. C. Shimmin, S. K. Shyue, D. Hewett-Emmett, and W. H. Li. 1994. Weak male-driven evolution in rodents. Proc. Natl. Acad. Sci. USA 91:827-831.
Charlesworth, B., J. A. Coyne, and N. H. Barton. 1987. The relative rates of evolution of sex chromosomes and autosomes. Am. Nat. 130:113-146.
Dermitzakis, E. T., A. Reymond, and R. Lyle, et al. (11 co-authors). 2002. Numerous potentially functional but non-genic conserved sequences on human chromosome 21. Nature 420:578-582.
Dimcheff, D. E., S. V. Drovetski, and D. P. Mindell. 2002. Phylogeny of Tetraoninae and other galliform birds using mitochondrial 12S and ND2 genes. Mol. Phyl. Evol. 24:203-215.
Ebersberger, I., D. Metzler, C. Schwarz, and S. Pääbo. 2002. Genomewide comparison of DNA sequences between humans and chimpanzees. Am. J. Hum. Genet. 70:1490-1497.
Ellegren, H. 2000. Evolution of the avian sex chromosomes and their role in sex determination. Trends Ecol. Evol. 15:188-192.
Ellegren, H. 2002a. Human mutation: blame (mostly) men. Nat. Genet. 31:9-10.
Ellegren, H. 2002b. Dosage compensation: do birds do it as well? Trends Genet. 18:25-28.
Ellegren, H., and A. K. Fridolfsson. 1997. Male-driven evolution of DNA sequences in birds. Nat. Genet. 17:182-184.
Ellegren, H., N. G. C. Smith, and M. T. Webster. 2003. Mutation rate variation in the mammalian genome. Curr. Opin. Genet. Dev. 13:562-568.
Eyre-Walker, A., and L. D. Hurst. 2001. The evolution of isochores. Nat. Rev. Genet. 2:549-555.
Fridolfsson, A. K., H. Cheng, and N. G. Copeland, et al. (10 co-authors). 1998. Evolution of the avian sex chromosomes from an ancestral pair of autosomes. Proc. Natl. Acad. Sci. USA 95:8147-8152.
Fridolfsson, A. K., and H. Ellegren. 2000. Molecular evolution of the avian CHD1 genes on the Z and W sex chromosomes. Genetics 155:1903-1912.
Hardison, R. C., K. M. Roskin, and S. Yang, et al. (18 co-authors). 2003. Covariation in frequencies of substitution, deletion, transposition, and recombination during eutherian evolution. Genome Res. 13:13-26.
Hellmann, I., I. Ebersberger, S. E. Ptak, S. Pääbo, and M. Przeworski. 2003. A neutral explanation for the correlation of diversity with recombination rates in humans. Am. J. Hum. Genet. 72:1527-1535.
Helm-Bychowski, K. M., and A. C. Wilson. 1986. Rates of nuclear DNA evolution in pheasant-like birds: evidence from restriction maps. Proc. Natl. Acad. Sci. USA 83:688-692.
Hoelzel, A. R., and A. Green. 1998. PCR protocols and population analysis by direct DNA sequencing and PCR-based DNA fingerprinting. Pp. 201–233 in A. R. Hoelzel, ed. Molecular genetic analysis of populations: a practical approach. Oxford University Press, Oxford, UK.
Kahn, N. W., and T. W. Quinn. 1999. Male-driven evolution among Eoaves? A test of the replicative division hypothesis in a heterogametic female (ZW) system. J. Mol. Evol. 49:750-759.
Lawson, L. J., and G. M. Hewitt. 2002. Comparison of substitution rates in ZFX and ZFY introns of sheep and goat related species supports the hypothesis of male-biased mutation rates. J. Mol. Evol. 54:54-61.
Lercher, M. J., and L. D. Hurst. 2002. Human SNP variability and mutation rate are higher in regions of high recombination. Trends Genet. 18:337-340.
Lercher, M. J., E. J. B. Williams, and L. D. Hurst. 2001. Local similarity in evolutionary rates extends over whole chromosomes in human-rodent and mouse-rat comparisons: implications for understanding the mechanistic basis of the male mutation bias. Mol. Biol. Evol. 18:2032-2039.
Li, W. H., S. J. Yi, and K. Makova. 2002. Male-driven evolution. Curr. Opin. Genet. Dev. 12:650-656.
Makova, K. D., and W. H. Li. 2002. Strong male-driven evolution of DNA sequences in humans and apes. Nature 416:624-626.
Malcom, C. M., G. J. Wyckoff, and B. T. Lahn. 2003. Genic mutation rates in mammals: local similarity, chromosomal heterogeneity, and X-versus-autosome disparity. Mol. Biol. Evol. 20:1633-1641.
Matassi, G., P. M. Sharp, and C. Gautier. 1999. Chromosomal location effects on gene sequence evolution in mammals. Curr. Biol. 9:786-791.
McVean, G. A. T., and B. Charlesworth. 1999. A population genetic model for the evolution of synonymous codon usage: patterns and predictions. Genet. Res. 74:145.
McVean, G. T., and L. D. Hurst. 1997. Evidence for a selectively favourable reduction in the mutation rate of the X chromosome. Nature 386:388-392.
Miyata, T., H. Hayashida, K. Kuma, K. Mitsuyasu, and T. Yasunaga. 1987. Male-driven molecular evolution: a model and nucleotide sequence analysis. Cold Spring Harbor Symp. Quant. Biol. 52:863-867.
Nachman, M. W., and S. L. Crowell. 2000. Estimate of the mutation rate per nucleotide in humans. Genetics 156:297-304.
Nanda, I., Z. Shan, and M. Schartl, et al. (15 co-authors). 1999. 300 million years of conserved synteny between chicken Z and human chromosome 9. Nat. Genet. 21:258-259.
Nanda, I., T. Haaf, M. Schartl, M. Schmid, and D. W. Burt. 2002. Comparative mapping of Z-orthologous genes in vertebrates: implications for the evolution of avian sex chromosomes. Cytogenet. Genome Res. 99:178-184.
Schmid, M., I. Nanda, and M. Guttenbach, et al. (34 co-authors). 2000. First report on chicken genes and chromosomes 2000. Cytogenet. Cell Genet. 90:169-218.
Shimmin, L. C., B. H. J. Chang, and W. H. Li. 1993. Male-driven evolution of DNA sequences. Nature 362:745-747.
Silva, J. C., and A. S. Kondrashov. 2002. Patterns in spontaneous mutation revealed by human-baboon sequence comparison. Trends Genet. 18:544-547.
Smith, N. G. C., M. T. Webster, and H. Ellegren. 2002. Deterministic mutation rate variation in the human genome. Genome Res. 12:1350-1356.
Smith, N. G. C., M. T. Webster, and H. Ellegren. 2003. A low rate of simultaneous double-nucleotide mutations in primates. Mol. Biol. Evol. 20:47-53.
Sniegowski, P. D., P. J. Gerrish, T. Johnson, and A. Shaver. 2000. The evolution of mutation rates: separating causes from consequences. Bioessays 22:1057-1066.
Sokal, R. R., and F. J. Rohlf. 1995. Biometry. W.H. Freeman, New York.
Tamura, K., and M. Nei. 1993. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol. Biol. Evol. 10:512-526.
Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. ClustalW—improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673-4680.
Waterston, R. H., K. Lindblad-Toh, and E. Birney, et al. (222 co-authors). 2002. Initial sequencing and comparative analysis of the mouse genome. Nature 420:520-562.
Williams, E. J. B., and L. D. Hurst. 2000. The proteins of linked genes evolve at similar rates. Nature 407:900-903.
Wolfe, K. H., P. M. Sharp, and W. H. Li. 1989. Mutation rates differ among regions of the mammalian genome. Nature 337:283-285.
Yang, Z. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci. 13:555-556.
Zhao, Z., and E. Boerwinkle. 2002. Neighboring-nucleotide effects on single nucleotide polymorphisms: a study of 2.6 million polymorphisms across the human genome. Genome Res. 12:1679-1686.(Erik Axelsson, Nick G.C. )
Key Words: male-biased mutation • Z chromosome • W chromosome • adaptive mutation rates • nonparametric bootstrapping
Introduction
How do mutation rates vary within genomes? To date, most vertebrate studies on mutation-rate variation have focused on mammals, with increasing evidence of significant local and regional substitution-rate variation at putatively neutral sites within mammalian chromosomes (Wolfe, Sharp, and Li 1989; Matassi, Sharp, and Gautier 1999; Williams and Hurst 2000; Lercher, Williams, and Hurst 2001; Smith, Webster, and Ellegren 2002; Waterston et al. 2002; Hardison et al. 2003). The causes of this variation are poorly understood (Ellegren, Smith, and Webster 2003), although the observation of covariation of substitution rates in orthologous regions along independent primate lineages shows that regional rate variation is deterministic and repeatable (Smith, Webster, and Ellegren 2002; Hardison et al. 2003). Attempts to explain mutation-rate heterogeneity include invoking sequence context effects (Silva and Kondrashov 2002; Zhao and Boerwinkle 2002; Arndt, Petrov, and Hwa 2003; Smith, Webster, and Ellegren 2003), an association between mutation and recombination processes (Lercher and Hurst 2002; Waterston et al. 2002; Hardison et al. 2003; Hellmann et al. 2003), and the evolution of isochores and the correlation between GC content and substitution rates (Eyre-Walker and Hurst 2001). Understanding how and why mutation rates vary is important not only in the contexts of the molecular basis for mutation and genome evolution but also for addressing the possibility for selection to modify mutation rates (mutation-rate evolution [Sniegowski et al. 2000]).
Substitution-rate variation is also seen at the level of individual chromosomes. There are significant differences in the mean substitution rate among autosomes in various mammalian comparisons (Lercher, Williams, and Hurst 2001; Ebersberger et al. 2002). The sex chromosomes show the most extreme variation, with the X chromosome evolving slower than autosomes and the Y chromosome evolving faster than autosomes (Li, Yi, and Makova 2002). At least two factors are thought to affect mutation-rate variation between the mammalian sex chromosomes. First, if replication error that occurs during germline cell division is the major mutagenic process, then the much greater number of germline cell divisions in males than in females should increase the Y chromosome mutation rate relative to the X chromosome (Miyata et al. 1987). This difference underlies the argument for male-biased mutation, or male-driven evolution. In primates, molecular evolutionary analyses show Y chromosome divergence to be approximately 2.2 times higher than X chromosome divergence (Shimmin, Chang, and Li 1993; Makova and Li 2002; but see Bohossian, Skaletsky, and Page [2000]), which translates to a male-to-female mutation rate ratio (m) of 4 to 6. In rodents, m is estimated at approximately 2 (Chang et al. 1994) and in goats at 3 to 4 (Lawson and Hewitt 2002), potentially indicating a correlation between generation time and m.
Second, we may expect selection to favor a reduced mutation rate on the X chromosome because of the hemizygous exposure of recessive deleterious mutations (McVean and Hurst 1997). This theory was supported by an early finding that the X-linked synonymous substitution rate in the mouse-rat comparison was reduced to a greater extent than could be caused by a male mutation bias alone (McVean and Hurst 1997). However, the current evidence for such an adaptive reduction in the X-linked mutation rate of mammals is weak. For example, in a human-chimpanzee comparison of genomic sequences, the reduction in X-linked rates can be explained by a male mutation bias and high ancestral polymorphism because confidence intervals of the estimates of the male mutation bias derived from different chromosome comparisons X/A, Y/A, and Y/X overlap (Ebersberger et al. 2002). A similar conclusion was recently reached by Malcom, Wyckoff, and Lahn (2003) from extensive human-mouse and mouse-rat comparisons that used synonymous substitution rates. It is unclear why the results from the two latter studies are not consistent with the observation of McVean and Hurst (1997).
The sex chromosome system of birds (ZZ males and ZW females) offers an interesting contrast to that of mammals, not least because the avian sex chromosomes evolved independently of those in mammals; that is, mammalian X chromosomes and avian Z chromosomes are not syntenic (Fridolfsson et al. 1998; Nanda et al. 1999, 2002). We expect the avian Z chromosome to have an elevated mutation rate caused by the male mutation bias because the Z chromosome spends two-thirds of its time in males, where rates of germline cell division are high. Accordingly, Z should evolve faster than the female-specific W chromosome. This prediction is supported by analyses of substitution rates in gametologous introns shared between the Z and W chromosome of various bird lineages (Ellegren and Fridolfsson 1997; Kahn and Quinn 1999; Carmichael et al. 2000; Fridolfsson and Ellegren 2000; Bartosch-Härlid et al. 2003); the Z chromosome evolves faster than the W chromosome. There is some variation in the different estimates of avian m (1.7 to 6.5), but the confidence intervals associated with these estimates are large, and so far, all estimates have been based on molecular evolutionary analyses of a limited number of short introns or short coding sequences.
Predictions for an adaptive reduction in X-linked mutation rates apply in the same way to the avian Z chromosome. Because deleterious mutations will be exposed on the Z chromosome when hemizygous in females, selection may favor a reduced mutation rate on this chromosome. The effects of adaptive mutation rates and the male mutation bias are, thus, expected to work in opposition in birds, leading to reduced and increased Z rate, respectively. Note that avian W as well as mammalian Y is always hemizygous but, as will be discussed later, several lines of argument suggest that there is little potential for adaptive mutation-rate evolution on these chromosomes. The evidence for Z evolving faster than W seems unambiguous, but this does not rule out that the Z rate is lower than what should be expected from male-biased mutation alone.
Fortunately, a quantitative assessment of the role of the male mutation bias and chromosome-specific mutation rates in birds can be obtained from comparisons of substitution rates in autosomes (A), the Z chromosome, and the W chromosome. If the male mutation bias is the main factor governing the mean mutation rate of chromosomes, then estimates of m should be similar from Z/A, Z/W, and A/W comparisons (cf. Miyata et al. 1987; McVean and Hurst 1997; Malcom, Wyckoff, and Lahn 2003). On the other hand, if the Z chromosome mutation rate is specifically reduced, m estimated from A/W should be higher than when estimated from Z/A and Z/W comparisons. To address this issue, we here make a large-scale attempt to analyze substitution rate variation in the avian genome by studying divergence in roughly 43 kb of orthologous, noncoding sequence of chicken (Gallus gallus) and turkey (Meleagris gallopavo). We obtain data from 74 different introns on autosomes, Z chromosome, and W chromosome and contrast m estimates from Z/A, Z/W, and A/W comparisons.
Materials and Methods
Collection of Sequence Data
Chicken and turkey intron sequences were derived for autosomal, Z-linked or W-linked genes (map information from ArkDB farm animal database at www.thearkdb.org or Schmid et al. [2000]), with the criterion of using only introns longer than 200 bp to reduce stochastic variation in estimates of divergence. This decision was motivated by the use of a novel bootstrapping method that bootstraps by both introns and sites. Also, excluding short introns may reduce the effect of constraint on small introns that results from conservation of splice sites. Because the exon-intron organization is not given for most avian genes in GenBank, we first Blasted chicken cDNA sequences against the draft human genome at NCBI (July 2003 build). Large gaps in the avian sequence produced in such Blast alignments should represent positions of putative introns; for all genes analyzed in this study, this approach revealed putative avian introns at precisely the same positions as in orthologous human genes. After this procedure, we designed exonic PCR primers for sequencing of both chicken and turkey introns (Appendix 1–3). For a few genes, the full genomic sequence, including exons and introns, was available in chicken. In those cases, exonic primers were designed for amplification in turkey only.
Chicken and turkey DNA was extracted from fresh muscle tissue by standard proteinase K digestion and phenol/chloroform purification, adapted from Hoelzel and Green (1998). PCR reactions were carried out in 50-μl reaction volumes that contained 20 to 250 ng of DNA template, 1× PCR Gold buffer (Applied Biosystems), 0.2 μM of each primer, 2.0 mM MgCl₂, 0.2 mM dNTPs (Amersham Pharmacia Biotech Inc), and 1 U AmpliTaq Gold (Applied Biosystems). Amplification reactions were performed using an initial denaturation at 95°C for 5 min, followed by 33 to 40 cycles at 94°C for 30 s and specific annealing and extension conditions for every intron. Amplified fragments were purified using the Qiaquick purification protocol (Qiagen), sequenced using BigDye™ Terminator Cycle Sequencing chemistry with original primers (Applied Biosystems), and sequences were recorded with an ABI377 semiautomated sequencing instrument (Applied Biosystems). Sites found to be heterozygous in direct sequencing were excluded from analysis. In some cases, sequencing was preceded by cloning, which was done using the pGEM-T vector kit. Sequencing reactions were then initiated with M13 vector primers. Sequence data from this article have been deposited with the GenBank Data Library under accession numbers AF006660, AF526055, AY139836 to AY139865, AY142943 to AY142944, AY144673 to AY144682, AY189754 to AY189777, AY194125 to AY194147, AY298959 to AY299013, AY380785 to AY380789, and AY426725 to AY426737.
Sequence Analysis
Orthologous chicken and turkey introns were aligned by use of ClustalW on default settings (Thompson, Higgins, and Gibson 1994), although some manual adjustment was required to improve the alignment of repetitive sequences. Pairwise distances were estimated by use of the baseml program in PAML version 3.11 (Yang 1997), with the Tamura-Nei (Tamura and Nei 1993) model of sequence evolution. Distances were estimated on the assumption that all sites evolve at the same rate (i.e., no among-site rate variation).
The estimation of confidence intervals and hypothesis testing was carried out by application of nonparametric bootstrapping. We developed a new bootstrapping procedure, termed double bootstrapping. For a given chromosome category (A, Z, or W) we first bootstrapped by introns, randomly sampling introns with replacement to give the same total number of introns as in the original data set, and then, for each of the intron alignments, we bootstrapped by sites, randomly sampling sites with replacement to generate alignments of the same length as the originals. The first stage of the bootstrapping procedure accounts for variation in substitution rates between different introns, as may be caused by regional variation in mutation and/or selection (reviewed in Ellegren, Smith, and Webster [2003]). Our preliminary observations suggest that this variation may be significant in bird genomes and that it is present in autosomes as well as the sex chromosomes (S. Berlin, N. G. C. Smith, and H. Ellegren, unpublished data). Note, for instance, that the point estimates of divergence in autosomal introns vary between 3.9% and 18.5% (table 1), although these estimates are associated with large confidence intervals. We are currently investigating the causes of this rate heterogeneity. Preliminary analyses suggest that conserved sites or blocks explain part of the variation, but an underlying variation in the mutation rate contributes as well. One important implication of regional substitution-rate variation is that estimates of m can be heavily biased when based on individual introns. The second stage of the bootstrapping accounts for noise generated during the estimation of divergence. Pairwise distances were calculated for each of the alignments after the double bootstrapping, and the unweighted mean of these distances was taken as the output.
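The two resampling stages can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each alignment is represented as a list of (chicken base, turkey base) columns, and a simple proportion of differing sites stands in for the Tamura-Nei distance that the study obtained from baseml. The function names are hypothetical.

```python
import random

def p_distance(columns):
    """Proportion of alignment columns whose two bases differ (stand-in for Tamura-Nei)."""
    if not columns:
        return 0.0
    return sum(a != b for a, b in columns) / len(columns)

def double_bootstrap_once(alignments, rng):
    """One replicate: resample introns, then resample sites within each sampled intron."""
    # Stage 1: sample introns with replacement, keeping the original number of introns.
    sampled = [rng.choice(alignments) for _ in alignments]
    distances = []
    for aln in sampled:
        # Stage 2: sample sites with replacement, keeping the original alignment length.
        cols = [rng.choice(aln) for _ in aln]
        distances.append(p_distance(cols))
    # Output of the replicate: unweighted mean of the per-intron distances.
    return sum(distances) / len(distances)

def double_bootstrap(alignments, replicates=1000, seed=0):
    """Return a list of bootstrap divergences for one chromosome category."""
    rng = random.Random(seed)
    return [double_bootstrap_once(alignments, rng) for _ in range(replicates)]
```

Calling `double_bootstrap` once per chromosome category (A, Z, W) would give three lists of replicate divergences from which rate statistics can then be derived.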
Table 1 Data for Autosomal Introns.
The bootstrapping process was repeated 1,000 times, thereby giving 1,000 sets of W-linked, Z-linked, and autosomal divergences from which to estimate the male mutation bias (m) and other rate statistics. The standard deviation of the bootstrap values gives an estimate of the standard error of the bootstrapped statistic (Sokal and Rohlf 1995). Hypothesis testing that required the comparison of rate statistics was performed by direct comparison of randomized bootstrap values.
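As a rough illustration of how a set of bootstrap values might be summarized, the sketch below reports the median, the standard deviation of the replicates (used as the standard error), and an interval. The percentile rule for the interval is an assumption; the text does not state how the 95% intervals were derived, and the function name is hypothetical.

```python
import statistics

def bootstrap_summary(values, alpha=0.05):
    """Median, standard error (SD of replicates), and a percentile interval."""
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[int((alpha / 2) * n)]                    # assumed percentile rule
    upper = ordered[min(n - 1, int((1 - alpha / 2) * n))]
    return {
        "median": statistics.median(ordered),
        "se": statistics.stdev(ordered),   # SD of bootstrap values estimates the SE
        "ci": (lower, upper),
    }
```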
Results
A set of autosomal, Z-linked and W-linked introns were sequenced and analyzed in chicken and turkey, two species with a divergence time of 28 MYA, estimated by use of mitochondrial DNA–based molecular clocks (Dimcheff, Drovetski, and Mindell 2002). We obtained 33 orthologous autosomal alignments with a total ungapped length of 16,188 bp (table 1), 28 Z-linked alignments with total length 16,079 bp (table 2), and 14 W-linked alignments with total length 10,621 bp (table 3).
Table 2 Data for Z-linked Introns.
Table 3 Data for W-linked introns.
There are pronounced differences in mean divergence among autosomes and sex chromosomes (autosomes [A] = 10.06%, Z chromosome = 10.95%, and W chromosome = 5.71%). When the divergences were calculated using the double-bootstrapping method (see Materials and Methods), we obtained the following values of medians and standard errors for the concatenated alignments: A = 10.08% ± 0.67%, Z = 10.99% ± 0.48%, and W = 5.74% ± 0.49%. Thus, both autosomal and Z-linked sequences seem to evolve at just under twice the rate of W-linked sequences in the chicken–turkey comparison. Figure 1 shows the distribution of bootstrap values for autosomes, the Z chromosome, and the W chromosome with the double-bootstrapping method.
FIG. 1. Histogram of 1,000 bootstrap values of autosomal, W-linked, and Z-linked divergences. Bootstrap values were obtained using the double-bootstrap method (see Materials and Methods)
Divergence data from autosomes and sex chromosomes allow the partitioning of the effects of the male mutation bias and one other factor that affects substitution rates. If substitution rates are solely determined by the male mutation bias, then all three estimates of the male mutation bias (m) based on pairwise comparisons of divergence in pairs of chromosome categories (equations 1 to 3, using the approach of Miyata et al. [1987]) should give the same value.
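Equations 1 to 3 are not reproduced in this text. A reconstruction consistent with the chromosome-specific scaling factors given in the Discussion (KW = 1, KA = (1 + m)/2, KZ = (1 + 2m)/3) would read as follows, where Z, A, and W denote divergence in each chromosome class. The assignment of equation numbers to comparisons is assumed from the order in which the comparisons are listed.

```latex
\begin{align}
\frac{Z}{A} &= \frac{\tfrac{2}{3}m + \tfrac{1}{3}}{\tfrac{1}{2}m + \tfrac{1}{2}}
             = \frac{2(1 + 2m)}{3(1 + m)} \tag{1} \\
\frac{Z}{W} &= \frac{1 + 2m}{3} \tag{2} \\
\frac{A}{W} &= \frac{1 + m}{2} \tag{3}
\end{align}
```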
From equations 1 to 3, we obtain the following m estimates: Z/A = 1.71 (95% confidence interval 0.62 to 7.16), Z/W = 2.37 (1.88 to 2.97), A/W = 2.52 (1.88 to 3.34). The bootstrap distribution of m estimates for each comparison is shown in figure 2.
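A minimal sketch of the point calculation follows, assuming the expected ratios implied by the scaling factors in the Discussion: Z/A = 2(1 + 2m)/[3(1 + m)], Z/W = (1 + 2m)/3, and A/W = (1 + m)/2, each solved for m. The function names are hypothetical. Plugging in the double-bootstrap medians gives values close to, but not identical with, the reported estimates, which are medians over the bootstrap replicates of m rather than a single calculation.

```python
def m_from_za(z, a):
    """Solve Z/A = 2(1 + 2m) / (3(1 + m)) for m."""
    r = z / a
    return (3 * r - 2) / (4 - 3 * r)

def m_from_zw(z, w):
    """Solve Z/W = (1 + 2m) / 3 for m."""
    return (3 * z / w - 1) / 2

def m_from_aw(a, w):
    """Solve A/W = (1 + m) / 2 for m."""
    return 2 * a / w - 1

# Double-bootstrap median divergences (percent) from the Results.
A, Z, W = 10.08, 10.99, 5.74
estimates = {
    "Z/A": m_from_za(Z, A),   # about 1.74
    "Z/W": m_from_zw(Z, W),   # about 2.37
    "A/W": m_from_aw(A, W),   # about 2.51
}
```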
FIG. 2. Histogram of 1,000 bootstrap values of the male-to-female mutation-rate ratio m(A/W), m(Z/W), and m(Z/A). Bootstrap values were obtained by use of the double-bootstrap method
We can quantify any discrepancies in the three estimates of m by using rearrangements of equations 1 to 3 to predict divergence in one chromosome category, given observed divergences in the two other. Here, we predict divergence on the Z chromosome, given the autosomal and W-linked data. Comparing the expected and observed Z-linked divergences then gives the percentage reduction in the Z-linked substitution rate, termed Zr (equation 4).
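Equation 4 is not reproduced in this text. One form consistent with the description, assuming that m is first obtained from the A/W comparison and then used to predict Z from W, is:

```latex
\begin{align}
Z_{\text{exp}} &= W \cdot \frac{1 + 2m_{A/W}}{3} = \frac{4A - W}{3}, \qquad m_{A/W} = \frac{2A}{W} - 1 \\
Z_r &= 100 \times \left(1 - \frac{Z_{\text{obs}}}{Z_{\text{exp}}}\right) \tag{4}
\end{align}
```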
Using the double-bootstrapping method, the median value of Zr is 4.89%, with a large bootstrap standard error of 8.50%. So in this approach, there is no evidence of a significant discrepancy (P = 0.286; one-tailed probability of Zr ≤ 0) from the male mutation bias predictions. Hence, with the present data set, the male mutation bias is sufficient to explain observed differences in divergence of autosomal, Z-linked, and W-linked sequences in the chicken–turkey comparison.
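The test can be sketched as follows, assuming that the expected Z divergence is (4A − W)/3 (the value implied by the A/W estimate of m) and that the one-tailed probability is the fraction of bootstrap replicates with Zr at or below zero. Both points are interpretations of the description, and the function names are hypothetical.

```python
import statistics

def z_reduction(a, z, w):
    """Percentage shortfall of observed Z divergence relative to the expected (4A - W) / 3."""
    expected = (4 * a - w) / 3
    return 100 * (1 - z / expected)

def zr_bootstrap(a_reps, z_reps, w_reps):
    """Median Zr, its bootstrap SE, and the fraction of replicates with Zr <= 0."""
    zr = [z_reduction(a, z, w) for a, z, w in zip(a_reps, z_reps, w_reps)]
    p_one_tailed = sum(v <= 0 for v in zr) / len(zr)
    return statistics.median(zr), statistics.stdev(zr), p_one_tailed
```

Applied to the median divergences alone, `z_reduction(10.08, 10.99, 5.74)` gives roughly 4.7%, close to the bootstrap median reported above.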
It is noteworthy that the m estimates from the three possible comparisons remain consistent despite the significant differences in GC content between autosomes and sex chromosomes: autosomes = 46.7%, Z chromosome = 36.3%, and W chromosome = 33.1% (t-tests: W-A, P << 0.0001; Z-A, P << 0.0001; Z-W, P = 0.088). This finding suggests that GC content does not have a strong effect on mean divergence of chromosomal classes, at least not in the sense that it would affect m estimates.
The observed divergences can be used for estimation of absolute substitution rates between chicken and turkey. Using a divergence time of 28 MYA (Dimcheff, Drovetski, and Mindell 2002), we obtain rates of A = 3.6 × 10⁻⁹, Z = 3.9 × 10⁻⁹, and W = 2.0 × 10⁻⁹ substitutions per site per year. These rate estimates may be useful for dating events in bird evolution based on nuclear sequence data, although substitution rates may, of course, vary among avian lineages. In mammals, substitution rates in the human and mouse lineages since the split of primates and rodents have been estimated at 2.2 × 10⁻⁹ and 4.5 × 10⁻⁹, respectively (Waterston et al. 2002). Note that these latter substitution rates represent the average rates since the time of divergence, and that current rates may differ even more as the difference in generation time between humans and most rodents should be more significant now than shortly after divergence (assuming a generation time effect on substitution rates).
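The arithmetic behind these figures can be checked with a short sketch. The reported values are reproduced when the pairwise divergence is divided by the divergence time itself; a per-lineage rate would instead divide by twice that time, so the hypothetical helper below exposes the number of lineages as a parameter.

```python
def substitution_rate(divergence_percent, divergence_time_years, lineages=1):
    """Substitutions per site per year; set lineages=2 for a per-lineage rate."""
    return (divergence_percent / 100) / (lineages * divergence_time_years)

T = 28e6  # chicken-turkey divergence time in years, as used in the text
rates = {
    "A": substitution_rate(10.08, T),   # about 3.6e-9
    "Z": substitution_rate(10.99, T),   # about 3.9e-9
    "W": substitution_rate(5.74, T),    # about 2.0e-9
}
```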
It is interesting to note that the substitution rate estimates for the chicken–turkey comparison are very similar to those obtained from restriction site analysis of several galliform species (Helm-Bychowski and Wilson 1986). These authors mapped 161 restriction sites from three autosomal regions and associated estimates of divergence with fossil evidence. Using different calibration points in galliform evolution, they arrived at rate estimates of 3.4 × 10⁻⁹ to 4.0 × 10⁻⁹ substitutions per site per year. Apparently, the relatively early molecular evolutionary work of Helm-Bychowski and Wilson (1986) is in good agreement with data from large-scale DNA sequencing.
Discussion
In this study, we addressed between-chromosome variation in mutation rates in birds. Our approach was to use variation in intronic divergence to infer mutation-rate variation directly. How well justified is this inference? The most likely complication is the effect of selection on intronic sequences in birds. Little is yet known on this subject, but given that comparative studies are uncovering numerous nongenic conserved regions in mammalian genomes (e.g., see Dermitzakis et al. [2002]) and that bird genomes are smaller than mammalian genomes, it seems possible that conserved regulatory sequences may be relatively common in avian genomes. However, given that a large number of introns were analyzed for each chromosome category and that there may be regional mutation-rate variation or local effects of selection, the present data set may better reflect the mean substitution rate of each chromosome category than a similar amount of sequence derived from a single region for each category.
The Male Mutation Bias
Our point estimates of m for the chicken–turkey comparison are 1.71, 2.37, or 2.52, depending on whether they are based on the Z/A, Z/W, or A/W comparison. Previous studies of male-biased mutation in birds have revealed m estimates in the range of 1.7 to 6.5 (Ellegren and Fridolfsson 1997; Kahn and Quinn 1999; Carmichael et al. 2000; Fridolfsson and Ellegren 2000; Bartosch-Härlid et al. 2003), so our point estimates are in the lower range of reported values (table 4). However, most of the estimates are associated with large confidence intervals, and the confidence intervals obtained in this study (0.62 to 7.16, 1.88 to 2.97, and 1.88 to 3.34) generally overlap with those of previous studies.
Table 4 Estimates of m in Birds.
If we assume a single true value of m, then we can estimate it by combining data from autosomes, the Z chromosome, and the W chromosome. For each intron, the observed number of substitutions [O(S)] is given by the ungapped length of the alignment multiplied by the divergence. Note that although we refer to these data as "observed," strictly speaking they are inferred from the alignments, but here we assume perfect inference of divergence. Then assuming a single m, we can calculate the expected number of substitutions, E(S), for each intron as the product of ungapped alignment length, a chromosome-specific scaling factor, and a normalizing factor that ensures the sum of expected substitutions equals the sum of observed substitutions. The chromosome-specific scaling factors (K) reflect the proportion of time spent in the male and female germlines: KW = 1, KA = (1+ m)/2, and KZ = (1+2m)/3. The log-likelihood of m can be approximated by the G-test statistic (page 692 in Sokal and Rohlf [1995]), which uses the O(S) and E(S) values to generate a maximum-likelihood estimate of m = 2.47. Approximate 95% confidence intervals can be estimated as the range of m for which the log-likelihood is within 2 units of the maximum, which yields a range of 2.27 to 2.68.
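The procedure described above can be sketched as a grid search over m. This is an illustration under stated assumptions, not the analysis code used in the study: the intron list is hypothetical toy data (constructed to fit m = 2.5 exactly), the grid range and step are arbitrary choices, and the log-likelihood is taken as -G/2 up to a constant.

```python
import math

# Chromosome-specific scaling factors from the text.
K = {
    "W": lambda m: 1.0,
    "A": lambda m: (1.0 + m) / 2.0,
    "Z": lambda m: (1.0 + 2.0 * m) / 3.0,
}

# HYPOTHETICAL introns: (chromosome class, ungapped length, divergence).
introns = [
    ("A", 600, 0.0875), ("A", 400, 0.0875),
    ("Z", 500, 0.1000), ("Z", 700, 0.1000),
    ("W", 800, 0.0500), ("W", 300, 0.0500),
]

def g_statistic(m, data):
    """G = 2 * sum(O * ln(O / E)) for a given m."""
    observed = [length * div for _, length, div in data]
    weights = [length * K[cls](m) for cls, length, _ in data]
    # Normalizing factor: makes sum(E) equal sum(O).
    norm = sum(observed) / sum(weights)
    g = 0.0
    for o, w in zip(observed, weights):
        if o > 0:  # a zero count contributes nothing to G
            g += o * math.log(o / (norm * w))
    return 2.0 * g

grid = [i / 100 for i in range(100, 501)]  # m from 1.00 to 5.00
g_values = [g_statistic(m, introns) for m in grid]
g_min = min(g_values)
m_hat = grid[g_values.index(g_min)]

# Approximate interval: log-likelihood within 2 units of the maximum,
# i.e. (G - G_min) / 2 <= 2.
support = [m for m, g in zip(grid, g_values) if (g - g_min) / 2.0 <= 2.0]
ci = (min(support), max(support))
print(m_hat, ci)
```

With so few toy substitutions the interval is wide; the much narrower interval reported in the text reflects the far larger number of observed substitutions in the real data.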
Because the amount of sequence data in the present study exceeds that of earlier avian studies by one or two orders of magnitude, and the data are based on autosomes as well as sex chromosomes, our maximum-likelihood m estimate may be viewed as the most accurate estimate yet obtained for birds. This contention is substantiated by the fact that our estimate is based upon sequence data from a large number of regions from each chromosome category. Regional mutation-rate variation would make m estimates sensitive to the particular regions used for molecular evolutionary analysis. Previous studies of male-biased mutation in birds (e.g., Ellegren and Fridolfsson [1997], Kahn and Quinn [1999], Carmichael et al. [2000], and Bartosch-Härlid et al. [2003]), as well as many studies in other organisms (e.g., Shimmin, Chang, and Li [1993], Bohossian, Skaletsky, and Page [2000], and Makova and Li [2002]), have been based on one or a few genomic regions only.
Given the overlap in confidence intervals between the present m estimates in the chicken–turkey comparison and those obtained in studies of other bird species, it would be premature to conclude that the male mutation bias is lower in galliforms than in other birds. On the other hand, a low point estimate of 1.8 was independently obtained for chicken and turkey using ATP5A1Z/ATP5A1W intron sequences (Carmichael et al. 2000), and the same estimate was obtained in a three-species comparison that included one anseriform and two other galliform species using CHD1Z/CHD1W introns (Bartosch-Härlid et al. 2003). We have recently found evidence of m being higher in avian lineages with longer generation time and with higher intensity of sexual selection, which suggests a link between life-history characteristics and the male mutation bias (Bartosch-Härlid et al. 2003). Most galliforms breed at the age of 1 year, so a rather weak male mutation bias would be consistent with a generation time effect.
A potential problem in estimation of the male mutation bias from sex-linked sequences is the effect of ancestral polymorphism, which can bias estimates of m when distances are low and lineage sorting is incomplete (Makova and Li 2002; Ellegren 2002a). However, the effect of ancestral polymorphism in our study is expected to be minimal because all pairwise distances between chicken and turkey are relatively high (5% to 11%), and with a divergence time of 28 MYA, lineage sorting should have been completed long ago. Makova and Li (2002) found ancient polymorphism to affect estimates of m in the human–chimpanzee comparison (1% divergence) but not in comparisons of human and more distantly related primates.
Is There a Reduction in the Z Chromosome Mutation Rate?
Our results indicate that in birds, the Z-linked introns evolve slightly faster than autosomal introns, which, in turn, evolve much faster than W-linked introns. These qualitative findings are in keeping with the male mutation bias predictions of Miyata et al. (1987), and there is, thus, no apparent need to invoke factors additional to the male mutation bias to explain variation in divergence among chromosomal classes. Alternatively, we can view our results as indicating that such potential factors, if they exist, must be weak, which is contrary to some previous suggestions (McVean and Hurst 1997).
There is the theoretical possibility that the increased efficacy of selection against slightly deleterious mutations on the avian Z chromosome relative to the autosomes (Charlesworth, Coyne, and Barton 1987) could reduce the mutation rate on the Z chromosome relative to null expectations. However, a quantitative analysis provides confidence that weak selection is unlikely to be responsible, for example, for a 5% reduction (the estimated value of Zr) in the Z chromosome mutation rate. Irrespective of the nature of dosage compensation in birds (see Ellegren [2002b]), the expected substitution rate of autosomal relative to Z-linked sequences, RA/Z, is given by (see equations 8a and 9 in Charlesworth, Coyne, and Barton [1987]):
Even if all slightly deleterious mutations are recessive (McVean and Charlesworth 1999), a conservative assumption with respect to the strength of selection required, then equation 5 shows that a 5% reduction in Z-linked divergence requires the magnitude of Ns, the product of the selective coefficient of mutations and the effective population size, to be unrealistically high.
We conclude that some theoretical arguments as well as our empirical observations do not support a significantly reduced Z-chromosome mutation rate. On the other hand, failure to demonstrate an effect does not mean that it does not exist. Our data set may simply have been too small to allow detection of a modest reduction in Z rate. Additional data are, thus, needed to firmly settle the question and should be accompanied by a power analysis to reveal what minimum reduction in the Z-chromosome mutation rate would be detectable with the data. Such an analysis would require a deeper insight into the patterns and causes of substitution rate heterogeneity among introns or other chromosomal regions. However, as a preliminary analysis of the power of our data set to determine the maximum value of a putative Z reduction, we took the 95th percentile of the 1,000 double-bootstrap estimates of Zr. This method indicated that Zr is significantly less than 18%. We note that this value is considerably less than the point estimate of Zr = 30% in rodents obtained from the X and A substitution data of McVean and Hurst (1997) combined with the assumption of a male-to-female mutation bias of 2 in rodents (Chang et al. 1994).
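To make the quantity Zr concrete, the sketch below computes a Z reduction from three class divergences. It assumes, as an interpretation rather than a statement of the exact procedure used here, that Zr = 1 - (observed Z)/(expected Z), where the expected Z divergence is obtained from m estimated in the A/W comparison; under the male mutation bias model this expectation simplifies to (4A - W)/3. The function names are hypothetical, and the inputs are the rounded values quoted in the abstract and in the site-only bootstrap, so the results are only approximations of the reported bootstrap-based estimates.

```python
# Sketch: Z reduction relative to the male-mutation-bias expectation.
# Assumed definition: Zr = 1 - Z_observed / Z_expected.

def expected_z(a, w):
    """Expected Z divergence given autosomal (a) and W-linked (w) divergence.

    m is taken from A/W = (1 + m) / 2, and Z_exp = w * (1 + 2m) / 3,
    which simplifies algebraically to (4a - w) / 3.
    """
    m = 2.0 * a / w - 1.0
    return w * (1.0 + 2.0 * m) / 3.0

def z_reduction(a, z, w):
    """Proportional shortfall of observed Z divergence (0.05 means 5%)."""
    return 1.0 - z / expected_z(a, w)

# Rounded class means from the abstract (%).
zr_means = z_reduction(10.08, 10.99, 5.74)
# Rounded medians from bootstrapping by sites only (%).
zr_sites = z_reduction(10.27, 10.89, 5.66)

print(f"Zr from class means:       {zr_means:.1%}")
print(f"Zr from site-only medians: {zr_sites:.1%}")
```

Applying the same function to each double-bootstrap replicate and taking the 95th percentile of the resulting values would give an upper bound of the kind described above.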
Why Not a Reduced W-Chromosome Mutation Rate?
It should be noted that we could equally well have predicted divergence on the W chromosome, given the observed autosomal and Z-linked rates. Similarly, in theory, we could have inferred the reduction in W-linked substitution rates, Wr, by comparison of expected and observed divergences (table 4). Using the present data set, the outcome of such an analysis would have been the same as the analysis of a possible reduction on Z; all comparisons give the same one-tailed probability (P = 0.286) and, thus, provide no statistical support for deviations from expectations from the male mutation bias. However, the question of which approach is the correct one is still warranted from a general perspective: Shall we rely on the observed W rate to calculate the expected Z rate (via comparison with autosomes), or shall we use the observed Z rate to predict the rate on W? Put in other words, should potential discrepancies between the three ways of estimating m be interpreted as a reduction in the substitution rate on the Z chromosome or as a reduction in the substitution rate on the W chromosome (or as some combination of the two factors)? In fact, the corresponding question applies to studies of mammalian sex chromosomes: Shall the observed Y rate be used to predict the rate on X, or shall the observed X rate be used to calculate the expected rate on Y? Previous work in this field has ignored the latter possibility and only addressed the possible reduction in X-chromosome mutation rate (McVean and Hurst 1997; Nachman and Crowell 2000; Ebersberger et al. 2002; Malcom, Wyckoff, and Lahn 2003).
There is no way to differentiate between these possibilities with present data, because three chromosome categories provide only two degrees of freedom, and one of those is used to estimate the male mutation bias. If we had some way of knowing the "true" male mutation bias, we could use the extra degree of freedom to differentiate between a W reduction and a Z reduction by seeing which of the Z/A and A/W comparisons gave the m estimate closest to the true value. However, evolutionary theory may help to resolve the issue. The argument for a reduction in the Z chromosome mutation rate (McVean and Hurst 1997) relies on cost-benefit considerations of adaptive mutation rates. Given that the Z chromosome contains many more genes than the tiny W chromosome (probably by at least two orders of magnitude [Ellegren 2000; Schmid et al. 2000]), the strength of selection for a reduced mutation rate should be expected to be much higher on Z than on W. Moreover, note that the benefit of a reduced mutation rate is the avoidance of deleterious mutations being exposed in a hemizygous chromosome. Although W is always hemizygous, a majority of the avian W-linked genes so far characterized have highly similar, and likely functionally equivalent, homologs on Z (Ellegren 2002b). Recessive mutations in these W-linked genes are likely to be masked by the gametologous Z-linked genes, thereby giving little benefit of mutation-rate reduction.
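The degrees-of-freedom argument can be made explicit with a short derivation. As a sketch, let d_A, d_Z, and d_W denote the expected class divergences and let \mu_f denote the female-germline rate with the divergence time absorbed into it (these symbols are introduced here for illustration only). Using the scaling factors K_W = 1, K_A = (1 + m)/2, and K_Z = (1 + 2m)/3:

```latex
\begin{align}
d_W &= \mu_f, \qquad
d_A = \mu_f\,\frac{1+m}{2}, \qquad
d_Z = \mu_f\,\frac{1+2m}{3}, \\
3\,d_Z + d_W &= \mu_f\,(1+2m) + \mu_f = 2\,\mu_f\,(1+m) = 4\,d_A .
\end{align}
```

Two parameters (\mu_f and m) are fitted to three class divergences, which leaves the single constraint 4 d_A = 3 d_Z + d_W. If the observed right-hand side falls short of the left-hand side, the shortfall can be attributed equally well to a lower-than-expected d_Z or a lower-than-expected d_W, which is why the divergence data alone cannot separate a Z reduction from a W reduction.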
The Double-Bootstrapping Method
Estimating the confidence intervals of divergence measures by bootstrapping individual sites from concatenated data sets is commonly applied in molecular evolutionary studies. However, the increasing support for regional mutation-rate variation within genomes (reviewed in Ellegren, Smith, and Webster [2003]), which includes our preliminary data for birds, necessitated an adjustment in the method for estimating the confidence intervals of chromosomal class divergences. Simply concatenating all alignments and then bootstrapping by sampling with replacement from all sites is not sufficient, because it does not account for the variation generated by the choice of a limited number of introns. To account for regional variation, we developed the double-bootstrapping method, which bootstraps by introns and by sites within intronic alignments and then takes the unweighted mean of the bootstrapped alignments. With this method, we found no statistical support for a specific reduction of the Z-chromosome mutation rate. Had we only bootstrapped by sites, slightly different divergence medians and significantly different standard errors would have been obtained (A = 10.27% ± 0.28%, Z = 10.89% ± 0.28%, and W = 5.66% ± 0.26%). Note that the much lower standard errors compared with the double-bootstrapping procedure are a consequence of not taking rate variation among introns into account. Importantly, with this approach, there would have been significant support for a reduction in the Z-chromosome mutation rate (Zr = 7.88%, P = 0.026). We believe this observation is one of general significance and that it calls for careful statistical treatment of molecular evolutionary data sets in the presence of underlying rate heterogeneity.
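The contrast between the two resampling schemes can be illustrated with a small simulation. This is a minimal sketch under several assumptions, not the implementation used in the study: the introns are hypothetical toy alignments coded as 0/1 per site (1 = the two species differ), divergence is the raw proportion of differing sites rather than a corrected distance, and the function names, seed, and replicate count are arbitrary.

```python
import random
import statistics

def make_intron(length, rate):
    """Toy alignment: 1 marks a site that differs between the two species."""
    n_diff = round(length * rate)
    return [1] * n_diff + [0] * (length - n_diff)

def site_bootstrap(introns, n_rep, rng):
    """Resample sites with replacement from the concatenated alignment."""
    sites = [s for intron in introns for s in intron]
    reps = []
    for _ in range(n_rep):
        sample = rng.choices(sites, k=len(sites))
        reps.append(sum(sample) / len(sample))
    return reps

def double_bootstrap(introns, n_rep, rng):
    """Resample introns, then sites within each chosen intron,
    and take the unweighted mean of the per-intron divergences."""
    reps = []
    for _ in range(n_rep):
        chosen = rng.choices(introns, k=len(introns))
        divergences = []
        for intron in chosen:
            sample = rng.choices(intron, k=len(intron))
            divergences.append(sum(sample) / len(sample))
        reps.append(statistics.mean(divergences))
    return reps

# HYPOTHETICAL data: five introns with heterogeneous divergence.
rng = random.Random(1)
introns = [make_intron(200, r) for r in (0.04, 0.08, 0.10, 0.12, 0.20)]

site_reps = site_bootstrap(introns, 300, rng)
double_reps = double_bootstrap(introns, 300, rng)

se_sites = statistics.stdev(site_reps)
se_double = statistics.stdev(double_reps)
print(f"site-only SE: {se_sites:.4f}  double-bootstrap SE: {se_double:.4f}")
```

When divergence varies among introns, the double-bootstrap standard error is larger because it also captures the sampling of a limited number of introns, which is the effect described in the paragraph above.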
Acknowledgements
Financial support was obtained from the Swedish Research Council. H.E. is a Royal Academy of Sciences Research Fellow supported by the Knut and Alice Wallenberg Foundation. We thank Scott Edwards and two anonymous reviewers for useful comments.
Literature Cited
Arndt, P. F., D. A. Petrov, and T. Hwa. 2003. Distinct changes of genomic biases in nucleotide substitution at the time of mammalian radiation. Mol. Biol. Evol. 20:1887-1896.
Bartosch-Härlid, A., S. Berlin, N. G. C. Smith, A. P. Møller, and H. Ellegren. 2003. Life history and the male mutation bias. Evolution 57:2398-2406.
Bohossian, H. B., H. Skaletsky, and D. C. Page. 2000. Unexpectedly similar rates of nucleotide substitution found in male and female hominids. Nature 406:622-625.
Carmichael, A. N., A. K. Fridolfsson, J. Halverson, and H. Ellegren. 2000. Male-biased mutation rates revealed from Z and W chromosome-linked ATP synthase alpha-subunit (ATP5A1) sequences in birds. J. Mol. Evol. 50:443-447.
Chang, B. H. J., L. C. Shimmin, S. K. Shyue, D. Hewett-Emmett, and W. H. Li. 1994. Weak male-driven evolution in rodents. Proc. Natl. Acad. Sci. USA 91:827-831.
Charlesworth, B., J. A. Coyne, and N. H. Barton. 1987. The relative rates of evolution of sex chromosomes and autosomes. Am. Nat. 130:113-146.
Dermitzakis, E. T., A. Reymond, and R. Lyle, et al. (11 co-authors). 2002. Numerous potentially functional but non-genic conserved sequences on human chromosome 21. Nature 420:578-582.
Dimcheff, D. E., S. V. Drovetski, and D. P. Mindell. 2002. Phylogeny of Tetraoninae and other galliform birds using mitochondrial 12S and ND2 genes. Mol. Phyl. Evol. 24:203-215.
Ebersberger, I., D. Metzler, C. Schwarz, and S. Pääbo. 2002. Genomewide comparison of DNA sequences between humans and chimpanzees. Am. J. Hum. Genet. 70:1490-1497.
Ellegren, H. 2000. Evolution of the avian sex chromosomes and their role in sex determination. Trends Ecol. Evol. 15:188-192.
Ellegren, H. 2002a. Human mutation: blame (mostly) men. Nat. Genet. 31:9-10.
Ellegren, H. 2002b. Dosage compensation: do birds do it as well? Trends Genet. 18:25-28.
Ellegren, H., and A. K. Fridolfsson. 1997. Male-driven evolution of DNA sequences in birds. Nat. Genet. 17:182-184.
Ellegren, H., N. G. C. Smith, and M. T. Webster. 2003. Mutation rate variation in the mammalian genome. Curr. Opin. Genet. Dev. 13:562-568.
Eyre-Walker, A., and L. D. Hurst. 2001. The evolution of isochores. Nat. Rev. Genet. 2:549-555.
Fridolfsson, A. K., H. Cheng, and N. G. Copeland, et al. (10 co-authors). 1998. Evolution of the avian sex chromosomes from an ancestral pair of autosomes. Proc. Natl. Acad. Sci. USA 95:8147-8152.
Fridolfsson, A. K., and H. Ellegren. 2000. Molecular evolution of the avian CHD1 genes on the Z and W sex chromosomes. Genetics 155:1903-1912.
Hardison, R. C., K. M. Roskin, and S. Yang, et al. (18 co-authors). 2003. Covariation in frequencies of substitution, deletion, transposition, and recombination during eutherian evolution. Genome Res. 13:13-26.
Hellmann, I., I. Ebersberger, S. E. Ptak, S. Pääbo, and M. Przeworski. 2003. A neutral explanation for the correlation of diversity with recombination rates in humans. Am. J. Hum. Genet. 72:1527-1535.
Helm-Bychowski, K. M., and A. C. Wilson. 1986. Rates of nuclear DNA evolution in pheasant-like birds: evidence from restriction maps. Proc. Natl. Acad. Sci. USA 83:688-692.
Hoelzel, A. R., and A. Green. 1998. PCR protocols and population analysis by direct DNA sequencing and PCR-based DNA fingerprinting. Pp. 201–233 in A. R. Hoelzel, ed. Molecular genetic analysis of populations: a practical approach. Oxford University Press, Oxford, UK.
Kahn, N. W., and T. W. Quinn. 1999. Male-driven evolution among Eoaves? A test of the replicative division hypothesis in a heterogametic female (ZW) system. J. Mol. Evol. 49:750-759.
Lawson, L. J., and G. M. Hewitt. 2002. Comparison of substitution rates in ZFX and ZFY introns of sheep and goat related species supports the hypothesis of male-biased mutation rates. J. Mol. Evol. 54:54-61.
Lercher, M. J., and L. D. Hurst. 2002. Human SNP variability and mutation rate are higher in regions of high recombination. Trends Genet. 18:337-340.
Lercher, M. J., E. J. B. Williams, and L. D. Hurst. 2001. Local similarity in evolutionary rates extends over whole chromosomes in human-rodent and mouse-rat comparisons: implications for understanding the mechanistic basis of the male mutation bias. Mol. Biol. Evol. 18:2032-2039.
Li, W. H., S. J. Yi, and K. Makova. 2002. Male-driven evolution. Curr. Opin. Genet. Dev. 12:650-656.
Makova, K. D., and W. H. Li. 2002. Strong male-driven evolution of DNA sequences in humans and apes. Nature 416:624-626.
Malcom, C. M., G. J. Wyckoff, and B. T. Lahn. 2003. Genic mutation rates in mammals: local similarity, chromosomal heterogeneity, and X-versus-autosome disparity. Mol. Biol. Evol. 20:1633-1641.
Matassi, G., P. M. Sharp, and C. Gautier. 1999. Chromosomal location effects on gene sequence evolution in mammals. Curr. Biol. 9:786-791.
McVean, G. A. T., and B. Charlesworth. 1999. A population genetic model for the evolution of synonymous codon usage: patterns and predictions. Genet. Res. 74:145.
McVean, G. T., and L. D. Hurst. 1997. Evidence for a selectively favourable reduction in the mutation rate of the X chromosome. Nature 386:388-392.
Miyata, T., H. Hayashida, K. Kuma, K. Mitsuyasu, and T. Yasunaga. 1987. Male-driven molecular evolution: a model and nucleotide sequence analysis. Cold Spring Harbor Symp. Quant. Biol. 52:863-867.
Nachman, M. W., and S. L. Crowell. 2000. Estimate of the mutation rate per nucleotide in humans. Genetics 156:297-304.
Nanda, I., Z. Shan, and M. Schartl, et al. (15 co-authors). 1999. 300 million years of conserved synteny between chicken Z and human chromosome 9. Nat. Genet. 21:258-259.
Nanda, I., T. Haaf, M. Schartl, M. Schmid, and D. W. Burt. 2002. Comparative mapping of Z-orthologous genes in vertebrates: implications for the evolution of avian sex chromosomes. Cytogenet. Genome Res. 99:178-184.
Schmid, M., I. Nanda, and M. Guttenbach, et al. (34 co-authors). 2000. First report on chicken genes and chromosomes 2000. Cytogenet. Cell Genet. 90:169-218.
Shimmin, L. C., B. H. J. Chang, and W. H. Li. 1993. Male-driven evolution of DNA sequences. Nature 362:745-747.
Silva, J. C., and A. S. Kondrashov. 2002. Patterns in spontaneous mutation revealed by human-baboon sequence comparison. Trends Genet. 18:544-547.
Smith, N. G. C., M. T. Webster, and H. Ellegren. 2002. Deterministic mutation rate variation in the human genome. Genome Res. 12:1350-1356.
Smith, N. G. C., M. T. Webster, and H. Ellegren. 2003. A low rate of simultaneous double-nucleotide mutations in primates. Mol. Biol. Evol. 20:47-53.
Sniegowski, P. D., P. J. Gerrish, T. Johnson, and A. Shaver. 2000. The evolution of mutation rates: separating causes from consequences. Bioessays 22:1057-1066.
Sokal, R. R., and F. J. Rohlf. 1995. Biometry. W.H. Freeman, New York.
Tamura, K., and M. Nei. 1993. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol. Biol. Evol. 10:512-526.
Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. ClustalW—improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673-4680.
Waterston, R. H., K. Lindblad-Toh, and E. Birney, et al. (222 co-authors). 2002. Initial sequencing and comparative analysis of the mouse genome. Nature 420:520-562.
Williams, E. J. B., and L. D. Hurst. 2000. The proteins of linked genes evolve at similar rates. Nature 407:900-903.
Wolfe, K. H., P. M. Sharp, and W. H. Li. 1989. Mutation rates differ among regions of the mammalian genome. Nature 337:283-285.
Yang, Z. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci. 13:555-556.
Zhao, Z., and E. Boerwinkle. 2002. Neighboring-nucleotide effects on single nucleotide polymorphisms: a study of 2.6 million polymorphisms across the human genome. Genome Res. 12:1679-1686.
(Erik Axelsson, Nick G.C. )