Positive Selection Within Sperm-Egg Adhesion Domains of Fertilin: An ADAM Gene with a Potential Role in Fertilization
《分子生物学进展》 (Advances in Molecular Biology), 2003, Issue 1
    Department of Biology, University of Winnipeg, Winnipeg, MB, Canada

    Abstract

    Genes with a role in fertilization show a common pattern of rapid evolution. The role played by positive selection versus lack of selective constraints has been more difficult to establish. One problem arises from attempts to detect selection in an overall gene sequence analysis. I have analyzed the pattern of molecular evolution of fertilin, a gene coding for a heterodimeric sperm protein belonging to the ADAM (A disintegrin and A metalloprotease) gene family. A nonsynonymous to synonymous rate ratio (dN/dS) analysis for different protein domains of fertilin α and fertilin β showed dN/dS < 1, suggesting that purifying selection has shaped fertilin's evolution. However, an analysis of the distribution of single positively selected codon sites using phylogenetic analysis by maximum likelihood (PAML) showed sites within adhesion domains (disintegrin and cysteine-rich) of fertilin β evolving under positive selection. The region 3' to the EGF-like domain of fertilin α, where the transmembrane and cytoplasmic tail regions are supposed to be localized, showed higher dN and dS than any other fertilin α region. However, it was not possible to identify positively selected codon sites there because the alignments of the carboxy-end region were ambiguous (ClustalX vs. DiAlign2). When this region was excluded from the PAML analysis, most single positively selected codon sites were concentrated within adhesion domains (cysteine-rich and EGF-like). The use of an ancestral sequence prior to a recent duplication event of fertilin α among non-Hominidae primates (Macaca, Papio, and Saguinus) revealed that the duplication is partially responsible for masking the detection of positively selected sites within the disintegrin domain. Finally, most ADAM genes with a potential role in sperm maturation and/or fertilization showed significantly higher dN estimates than other ADAM genes.

    Key Words: fertilin • gene duplication • fertilization • selection • mammals

    Introduction

    Analysis of the evolution of single genes with a sex-related function shows a common pattern of rapid evolution in a wide variety of taxa. The most striking examples come from marine invertebrates, where sperm surface genes involved in fertilization show a high ratio of nonsynonymous to synonymous replacements, indicating the role played by positive selection in shaping the evolution of such genes. Such a pattern is not unique to genes of fertilization in external fertilizers. In Drosophila, positive selection has been shown for accessory gland protein products and a gene identified as potentially responsible for hybrid male sterility.

    In mammals, genes involved in sex determination such as Sry have shown patterns of rapid evolution that were first attributed to positive selection. More recently, it has been suggested that the rapid evolution of Sry is due to lack of selective constraints or episodic selection between species that have diverged for a long period of time. Positive selection has been proposed based on the low intraspecific polymorphism and high divergence of a gene coding for a protein secreted in mouse saliva and used as a pheromonal signal. The analysis of sequence evolution at DAZ, a candidate gene for male infertility, has recently shown signs of positive selection in primates. Despite the large number of examples, little is known about the role of selection in the evolution of male and female reproductive genes directly involved in fertilization reactions in mammals. An exception comes from the work by Swanson and collaborators, who have detected the role of selection in shaping the rapid evolution of female reproductive proteins expressed in the egg zona pellucida.

    Mammalian fertilization is a complex process that proceeds through a series of steps involving recognition and binding of the sperm to the zona pellucida, the ability of the sperm to cross the barriers imposed by the egg, and sperm fusing to the egg membrane. Prior to sperm-egg interaction, sperm must undergo a series of modifications while passing through the epididymis, followed by capacitation of the sperm membrane in the female reproductive tract. Members of the ADAM (A Disintegrin and A Metalloprotease) protein family are among a number of candidates that might serve as binding partners for the egg-membrane surface proteins or as epididymal secretory proteins that interact with spermatozoa.

    Fertilin, an ADAM sperm protein involved in the sperm-egg plasma membrane interaction, has been better characterized than other mammalian sperm proteins. Fertilin is a heterodimeric glycoprotein that was first identified in guinea pig using monoclonal antibodies to sperm surface antigens that could inhibit sperm-egg fusion. The protein is composed of an alpha and a beta subunit with similar domain structures and is proteolytically processed during sperm development by removal of the prodomain and metalloprotease domain. Processing of fertilin is crucial for exposing the disintegrin domain that mediates sperm-egg binding and for allowing proper localization of fertilin in the head of mature sperm. Although the prodomain and the metalloprotease domain of some ADAM proteins act to block or to promote protease activity, these domains have no such role in fertilin.

    [Figure omitted]

    FIG. 1. Diagrams of domain structure of fertilin pro-α and fertilin pro-β proteins and the proteolytic events leading to mature fertilin α and fertilin β present in sperm. Numbers under the domains refer to the amino acid position defining their limit in the GenBank Cavia cobaya sequence entries (above) and positions defined using the NCBI conserved domain search service (below). Domains identified using the NCBI service are not consecutive and gaps between them are represented by a dash. Domains are Pro = prodomain, Met = metalloprotease, Dis = disintegrin, Cys = cysteine-rich, EGF = epidermal growth factor-like, Tm = transmembrane, and Cyt = cytoplasmic tail

    The disintegrin domain of fertilin has been the most widely studied in terms of its function. Its role in sperm-egg binding and blocking subsequent sperm fusion with the egg has been proposed from studies that tested the ability of synthetic peptides to inhibit sperm-egg binding in in vitro systems, from the use of recombinant fertilin α and β that included or lacked the disintegrin domain, and from the analysis of knockout mutations within the disintegrin domain of fertilin β. Recombinant forms of fertilin α can inhibit sperm-egg binding if they have only the disintegrin domain or only the cysteine-rich and EGF-like domains, suggesting that the disintegrin domain is not the only one with adhesion activity. The functional role of the transmembrane domain and the cytoplasmic tail domain of fertilin is not known. However, there is evidence that fertilin localizes in the head of mature sperm by lateral diffusion across the membrane and that such localization is species specific.

    Given our current knowledge on the processing of fertilin during sperm maturation and its potential role in sperm-egg interaction and fertilization, I have analyzed the pattern of molecular evolution of both fertilin α and β. This article considers whether the evolution of this protein, given its potential role in sperm maturation and fertilization, has been shaped by positive selection. I then asked whether the distribution of positively selected sites among protein regions is homogeneous as opposed to concentrated within protein regions having a role in sperm-egg interaction and/or sperm maturation.

    Materials and Methods

    Sequence Data

    Both nucleotide and amino acid sequences of fertilin were collected from GenBank for the following species, with the accession numbers provided in parentheses: fertilin α from Bos taurus (AF086807), Oryctolagus cuniculus (U46069), Macaca fascicularis fertilin α I (X79808), M. fascicularis fertilin α II (X79809), Rattus norvegicus (Y08616), Pongo pygmaeus (Y15491), Saguinus oedipus fertilin α I (Y15511), S. oedipus fertilin α II (Y15512), Papio hamadryas anubis fertilin α I (Y15519), P. hamadryas anubis fertilin α II (Y15520), Cavia cobaya (Z11719), and Mus musculus (U22056, AF167406). The following fertilin β sequences were also retrieved from GenBank for sequence analysis: Bos taurus, Macaca fascicularis, Homo sapiens, Mus musculus, Oryctolagus cuniculus, Rattus norvegicus, and Cavia cobaya. Amino acid sequences were aligned using the global alignment algorithm ClustalX and the local alignment algorithm DiAlign2. Amino acid alignments were used to generate nucleotide alignments. Domains were identified within sequences by following the Cavia cobaya domain assignments available from the GenBank sequence entries and by comparing amino acid sequences to domains derived from two collections, Smart and Pfam, using the NCBI conserved domain search service. In assigning domains, I used the amino acid positions defined by the Cavia cobaya GenBank entries. However, the prodomain and the transmembrane and cytoplasmic tail domains were broadly kept in the analysis as amino-end and carboxy-end, respectively, because they were not identified when comparing sequences to domain databases.

    Sequence Analysis: Testing Selection

    Positive selection can be inferred from a higher proportion of nonsynonymous than synonymous substitutions per site (dN/dS > 1). dN and dS were calculated using the modified Nei-Gojobori Jukes-Cantor method, which considers deviations from an equal frequency of transition (ts) and transversion (tv) substitutions (pp. 57–60). The MEGA2 software was used to calculate the ts/tv ratio (R), and R was used as input in the calculation of dN and dS. Estimates were obtained for the entire aligned sequences as well as for defined domains within sequences.
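    As an illustration of the distance correction involved, the sketch below applies the Jukes-Cantor formula d = -(3/4) ln(1 - 4p/3) to observed proportions of nonsynonymous and synonymous differences and takes their ratio. The function names and the counts are hypothetical and are not taken from the fertilin data. The counting of synonymous and nonsynonymous sites with the ts/tv ratio R, which MEGA2 performs in the modified method, is assumed to have been done already.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor corrected number of substitutions per site
    from an observed proportion of differences p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return (dN, dS, dN/dS) from counts of differences and sites."""
    d_n = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    d_s = jukes_cantor(syn_diffs / syn_sites)
    return d_n, d_s, d_n / d_s


# Hypothetical counts for one pairwise comparison (not fertilin data).
d_n, d_s, ratio = dn_ds(nonsyn_diffs=30, nonsyn_sites=700,
                        syn_diffs=40, syn_sites=300)
print(f"dN = {d_n:.4f}, dS = {d_s:.4f}, dN/dS = {ratio:.2f}")
```

    With these made-up counts the ratio falls below 1, the pattern read in the text as purifying selection.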

    To detect specific amino acid sites under positive selection among sites potentially experiencing variable selective pressures, I used the likelihood ratio test. Using the codemlsites program of PAML, in which the unit of evolution is the codon, I obtained log likelihood estimates (ℓ) of a tree topology under models that impose alternative assumptions in terms of rate variation (ω = dN/dS) over codon sites. Model 0 (M0) assumes a constant ω ratio across codon sites, whereas model 3 (M3) assumes different proportions (p_i) of discrete classes of sites with different ω_i ratios. Twice the log likelihood difference (2Δℓ) between M3 and M0 estimates provides a test for the existence of rate variation over codon sites. Another test compares the log likelihood of a tree under model 7 (M7), which assumes a distribution of ω values constrained between 0 and 1 (no positive selection), and model 8 (M8), which adds a class of sites with ω ratios > 1.0 estimated from the data. Comparing ℓ between these two models tests for the existence of amino acid sites under positive selection. If the likelihood ratio test suggests the presence of sites under positive selection, then these sites can be identified by using a Bayesian method to estimate posterior probabilities (P) that particular sites come from a class with ω > 1.0. I used P(ω > 1) > 0.95 as the lower threshold to identify sites under positive selection.

    Sequence Analysis: Gene Duplication

    In some species, fertilin has undergone gene duplication, and so it is possible that any signal or lack of signal of positive selection could be a consequence of differentiation between paralogs. I used the baseml program of PAML to reconstruct sequences ancestral to the duplication event under models that assume rate variation over nucleotide sites and different patterns of nucleotide substitutions. All models used here assume nucleotide substitution rates drawn from a gamma distribution. The most complex model (HKY85) assumes different equilibrium frequencies for the four nucleotides (π_i) and different transition/transversion rate ratios (κ). A simpler model is one assuming no difference between transition and transversion rates (F81; κ = 1) or one assuming equal equilibrium frequencies for all nucleotides (K80; π_i = 1/4). The simplest model is one in which nucleotide frequencies are equal and there is no difference in transition/transversion rate ratios (JC69).

    The log likelihood of the species tree topology for fertilin α was calculated under the different models of nucleotide substitutions, and differences in log likelihood (2Δℓ) among models were compared to a χ² distribution with degrees of freedom given by the difference in number of parameters estimated for each model. The model that better explained the data was used to reconstruct a sequence ancestral to the fertilin α I–fertilin α II duplication event. The ancestral sequence was used to test episodes of positive selection at specific codon sites prior to the duplication event.

    Comparisons Among ADAM Gene Family Members

    I used the modified Nei-Gojobori Jukes-Cantor method available in the MEGA2 software package to compare dN and dS estimates among 14 different ADAM genes whose sequences are available from mouse and human and for which information on their tissue of expression is available. Estimates were obtained for the metalloprotease and the disintegrin domains identified using the NCBI conserved domain search service. The accession numbers of the gene sequences used are shown in parentheses (Mus musculus; Homo sapiens): Adam 7 (AF013107; AF215824), Adam 8 (NM_007403; NM_001109), Adam 9 (NM_007404; NM_003816), Adam 10 (AF011379; XM_007741), Adam 11 (NM_009613; AB009675), Adam 12 (D50411; AF023477), Adam 15 (AB022089; NM_003815), Adam 19 (D50410; AF311317), Adam 21 (NM_020330; AF158644), Adam 23 (NM_011780; NM_003812), Adam 28 (AF153350; NM_021777), fertilin β (U38806; U38805), Adam 18/tmdc III (AF167405; AJ133004), TNFα (U69613; U69611).

    To test for significant differences in dN estimates between genes, I used a t-test with infinite degrees of freedom. The test is similar to that used to detect differences between dN and dS (p. 55) and consists of calculating the difference in the proportion of substitutions per site and weighting it by the standard errors of the estimates. The formula for the test statistic is Z_ij = (dN_i - dN_j) / (s_i² + s_j²)^(1/2), where dN_i and s_i are the proportion of nonsynonymous substitutions per nonsynonymous site for gene i and its standard error, while dN_j and s_j are the same estimates for gene j. The same test was used for differences in dS estimates between genes.
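    The test statistic above can be sketched in a few lines. The dN values and standard errors below are hypothetical, and the two-sided P value from the standard normal distribution is an assumption about how the statistic is evaluated (a t-test with infinite degrees of freedom is equivalent to a normal test).

```python
import math


def z_statistic(d_i, se_i, d_j, se_j):
    """Z_ij = (d_i - d_j) / sqrt(se_i^2 + se_j^2)."""
    return (d_i - d_j) / math.sqrt(se_i ** 2 + se_j ** 2)


def two_sided_p(z):
    """Two-sided P value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical dN estimates and standard errors for two genes.
z = z_statistic(0.30, 0.03, 0.15, 0.04)
p = two_sided_p(z)
print(f"Z = {z:.2f}, P = {p:.4f}")
```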

    Results and Discussion

    Different Selective Pressures Within Domains of Fertilin α and β

    The analysis of the average proportions of nonsynonymous and synonymous substitutions shows that both fertilin α and β have a dN/dS ratio lower than one, suggesting that these genes have evolved under purifying selection.

    [Table omitted]

    Table 1 Average dN/dS Ratio and Proportions of Nonsynonymous (dN) and Synonymous (dS) Substitutions per Site Among Species for Different Domains of Fertilin

    If the rate of synonymous substitutions is not uniform across gene regions or domains, then a high dN/dS ratio at a specific gene region or domain could result from constraints on the proportion of synonymous substitutions, and a low dN/dS ratio from an elevated rate of synonymous substitutions. The carboxy-end region of fertilin α has high proportions of nonsynonymous (dN) as well as synonymous (dS) substitutions compared to other domains, and so its high dN/dS ratio does not seem to be caused by constraints on the proportion of synonymous substitutions. The metalloprotease and disintegrin domains show the lowest dN/dS ratios among domains, but while the metalloprotease domain has the lowest dN, the disintegrin domain shows a dS estimate higher than all other domains except the carboxy-end region.

    The pattern of substitutions for fertilin β shows the disintegrin, cysteine-rich, and EGF-like domains having higher proportions of synonymous substitutions than other domains. The proportion of synonymous substitutions is particularly high for the disintegrin domain, making its dN/dS ratio the lowest of all. The proportion of nonsynonymous substitutions is also higher for the disintegrin and cysteine-rich domains than for others, with the EGF-like domain showing a dN estimate similar to other fertilin β domains.

    The question remains whether the elevated proportions of synonymous changes at domains such as the disintegrin domain of both fertilin α and β might mask the detection of nonsynonymous changes that could have accumulated due to adaptive divergence between species.

    Testing for Codon Changes Driven by Positive Selection

    Although the overall dN/dS estimate does not indicate a signal of positive selection, it is possible that particular sites within a coding sequence might be under positive selection. Because the assignment of domains based on the Cavia cobaya protein sequence might be considered arbitrary, I have used a site testing approach to search for codon sites under selection. This approach offers the advantage that searches of positively selected sites can be done without a priori information about domains. Once sites are identified, they can be located within a known domain or within unidentified protein regions. It is also possible that if only specific sites within a domain have evolved under the influence of positive selection, a window analysis based on the entire domain will not detect them.

    Phylogenetic analysis by maximum likelihood (PAML) was used to estimate the likelihood of a phylogeny under models that make alternative assumptions about the dN/dS rate ratio among codon sites (ω). The tree topologies used for the analysis are shown in figure 2. The analysis of fertilin α and fertilin β was run using ClustalX and DiAlign2 alignments. For fertilin α, the alignments obtained when using alternative algorithms were not consistent for residues in the carboxy-end region (beyond the EGF-like domain). This creates uncertainty when trying to determine which codon sites are positively selected. Therefore, before performing the PAML analysis, I removed residues beyond the EGF-like domain, which made the results obtained using ClustalX and DiAlign2 alignments consistent.

    [Figure omitted]

    FIG. 2. Neighbor-Joining Poisson corrected phylogenetic trees for fertilin α (top) and fertilin β (bottom) based on amino acid sequence alignments show the topology used in the phylogenetic analysis by maximum likelihood (PAML). The arrow points at a node previous to a recent duplication event in fertilin α for which sequence was reconstructed (see Materials and Methods)

    [Figure omitted]

    FIG. 3. Amino acid position in the Cavia cobaya sequence identified as being under positive selection (bolded) using M8 versus M7 PAML comparisons. Underlined amino acid sites were detected by using sequences aligned with ClustalX, while open circles point at positively selected sites detected using sequences aligned with DiAlign2. a, Partial fertilin α sequence (sites removed from the alignment are in italics): Positively selected sites are 204 Q (Pro/Und); 508 V (Dis); 585 Q, 591 T, 611 S, 644 S, and 652 A (Cys); 672 L and 674 T (Cys/Und); 678 S and 699 D (EGF). b, Full-length fertilin α: Positively selected sites are not singled out (see results). c, Fertilin β: Positively selected sites are 407 Q, 408 D, 423 R, 428 P, 437 T, and 446 T (Dis); 476 N, 490 K, 494 Q, 506 V, 510 E, 527 P, and 583 A (Cys); 645 Q (Und)

    For fertilin α, the estimated ω value from M0 was 0.45 and the log likelihood estimate was ℓ = -11,643.92. This model has only one parameter estimated from the data and therefore only one degree of freedom. For M3, five independent parameters given by three ω_i values and two p_i proportions were estimated from the data. The estimates under this model suggest that 87% of sites in fertilin α are under strong or mild purifying selection with ω0 = 0.04 and ω1 = 0.65 (p0 = 0.44 and p1 = 0.43), whereas 13% of sites have more than a 0.5 probability of being under positive selection with ω2 = 2.34. M3 fits the data significantly better than M0; the test statistic is 2Δℓ = 2 x (-11,264.07 - (-11,643.92)) = 759.7, compared to a χ² value with df = dfM3 - dfM0 = 4.
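    The M3 versus M0 comparison can be reproduced from the two log likelihoods reported in the text. The sketch below is only an illustration: the helper name is hypothetical, and it evaluates the χ² tail probability with the closed form that holds for even degrees of freedom, which covers df = 2 and df = 4.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable for even df,
    using exp(-x/2) * sum_{k < df/2} (x/2)^k / k!."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Log likelihoods reported in the text for fertilin alpha.
lnl_m0 = -11643.92
lnl_m3 = -11264.07

stat = 2.0 * (lnl_m3 - lnl_m0)          # twice the log likelihood difference
p_value = chi2_sf_even_df(stat, df=4)   # df = dfM3 - dfM0 = 4
print(f"2*dlnL = {stat:.2f}, P = {p_value:.3g}")
```

    The statistic is far beyond the 5% critical value for df = 4 (about 9.49), so M0 is rejected in favor of M3.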

    [Table omitted]

    Table 2 Model Parameter Estimates, Degrees of Freedom, Log Likelihood Values and Test Statistics Under Different PAML Models for Fertilin α and Fertilin β

    A second, more refined test for the presence of positively selected sites compares model 7 (M7) and model 8 (M8). M7 assumes a beta distribution [B(p,q)] of ω values constrained between 0 and 1, and therefore the likelihood of the fertilin α phylogeny depicted in figure 2 is estimated under the assumption of no positive selection among sites. The two parameters estimated for M7 were p = 0.23 and q = 0.29 (df = 2), and the log likelihood estimate was ℓ = -11,301.91. M8 also assumes a beta distribution of ω values with parameters p and q but allows for sites with ω > 1.0, and therefore the log likelihood estimate for the tree topology considers codon sites under positive selection. The likelihood ratio test comparing M7 and M8 shows that M8 fits the data better than M7 because the test statistic 2Δℓ = 2 x (-11,267.75 - (-11,301.91)) = 78.32 is significantly greater than a χ² value with df = dfM8 - dfM7 = 2. The M8 estimates suggest that about 10% of sites (p1) have higher than a 0.5 posterior probability (P > 0.5) of being under positive selection with ω = 2.60.
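    The posterior probabilities used to flag sites follow from Bayes' rule applied to each codon site. The sketch below is a simplified illustration, not PAML's implementation: it uses three discrete classes with the M3 proportions reported for fertilin α rather than the beta distribution of M8, and the per-class site likelihoods are hypothetical.

```python
def class_posteriors(proportions, site_likelihoods):
    """Posterior probability of each site class for one codon site:
    P(class k | data) = p_k * L_k / sum_j (p_j * L_j)."""
    weighted = [p * lik for p, lik in zip(proportions, site_likelihoods)]
    total = sum(weighted)
    return [w / total for w in weighted]


# Class proportions reported for fertilin alpha under M3 (omega0, omega1, omega2).
proportions = [0.44, 0.43, 0.13]
# Hypothetical likelihoods of the data at one codon site under each class.
site_likelihoods = [1e-6, 2e-5, 4e-3]

posteriors = class_posteriors(proportions, site_likelihoods)
flagged = posteriors[2] > 0.95   # class 2 is the omega > 1 class
print([round(p, 4) for p in posteriors], flagged)
```

    A site is reported as positively selected only when the posterior for the ω > 1 class exceeds the chosen threshold, here 0.95.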

    Figure 3 shows codon sites of fertilin α inferred to be under positive selection when a more stringent 95% posterior probability threshold (P > 0.95) is applied. The figure also shows whether sites are consistently detected as being under positive selection depending on the alignment used. Figure 3a lists the sites detected as being under positive selection when only a partial fertilin α sequence is considered. The results show that regardless of the alignment used, the same sites are detected as being under positive selection and most are within the cysteine-rich domain (parameters of the PAML model are those shown in table 2).

    Figure 3b shows that when the entire fertilin α sequence is used, a very high number (25) of positively selected sites are detected in the carboxy-end region, where the transmembrane and cytoplasmic tail domains of this protein are supposed to be localized. However, whether most sites in this region are detected as positively selected depends on the alignment algorithm used, reflecting the ambiguity in the alignment of the carboxy-end region. The only time when differences between the alignment programs are not obvious is when the sequences are very similar to one another. Although the ambiguity of the alignment does not allow one to single out positively selected codon sites at the carboxy-end of fertilin α, it is interesting that this region of the protein is so highly divergent. In guinea pig, lateral diffusion of the fertilin protein across the cell membrane of sperm is known to be important during sperm maturation, and such movement can lead to species-specific localization of fertilin in the sperm head. Adaptive differentiation between species in the transmembrane and cytoplasmic tail portion of the protein may define a species-specific pattern of cellular interactions; however, the quality of the sequences available in GenBank might also be an issue.

    The M0 estimated ω value was 0.37 for fertilin β, and the log likelihood estimate was ℓ = -11,259.27. M3 estimates suggest that most sites in fertilin β are evolving under strong or mild purifying selection with ω0 = 0.08 and ω1 = 0.71 (p0 = 0.52 and p1 = 0.39), whereas approximately 9% of sites have more than a 0.5 probability of being under positive selection with ω2 = 3.75. M3 fits the data significantly better than M0; the test statistic is 2Δℓ = 625, compared to a χ² value with df = 4. The estimated distribution of ω values under M7, B(0.34, 0.50), has most of the sites having ω values close to either 0 or 1, and the log likelihood estimate under this model was ℓ = -10,987.35. The M8 model fits the data better than M7 (2Δℓ = 81.2, significantly greater than a χ² value with df = 2). The M8 estimates suggest that about 9% of fertilin β sites (p1) have more than a 0.5 probability of being under positive selection with ω = 3.60. Figure 3c shows sites within different domains of fertilin β inferred to be under positive selection at a 95% posterior probability threshold level (P > 0.95) when alternative alignments are used (ClustalX or DiAlign2). Results are consistent regardless of the alignment used, with positively selected sites being detected only within the disintegrin and the cysteine-rich domains.

    [Table omitted]

    Table 3 Log Likelihood and Parameters Estimated for the Fertilin α Gene Tree Under Different Nucleotide Substitution Models Tested for Reconstruction of Sequence Ancestral to Fertilin α I–Fertilin α II Gene Duplication

    Gene Duplication and Rapid Evolution

    It is possible that the rapid evolution driven by positive selection of some sites within fertilin α is solely a consequence of the potential role of the protein in reproduction. However, other mechanisms could drive such rapid divergence. For example, a recent duplication event in Macaca, Saguinus, and Papio leading to the origin of fertilin α I and fertilin α II may have caused subsequent subfunctionalization of the protein duplicates. Subfunctionalization appears to be a likely mechanism to explain the preservation of gene duplicates in mammals and implies a differential loss and retention of functions present in the ancestral gene rather than the acquisition of novel functions. Therefore, subfunctionalization assumes no acquisition of beneficial mutations driving the evolution of duplicates and hypothesizes that degenerative mutations that preserve both duplicates as functional copies allow the acquisition of complementary sets of subfunctions.

    It is possible that part of the positive selection signal detected for fertilin α could be assigned to acquisition of new functions between paralogs or, alternatively, that strong preservation of function between recently derived duplicates may be masking an ancestral event of adaptive diversification by positive selection.

    To determine whether fertilin α duplicates have recently diversified in function or preserved an ancestral status, I reconstructed ancestral sequences using models that assume different patterns of nucleotide substitutions. Log likelihood estimates for the fertilin α tree topology under different models of nucleotide substitution are shown in table 3. Model JC69 appears unacceptable compared to model K80 because the transition/transversion ratio is different from 1 (κ = 3.7) (2Δℓ = 643.6, df = 1; P < 0.0001). Similarly, among models that do not assume equal nucleotide frequencies (π_i), model HKY85 is significantly better than model F81 (2Δℓ = 638.6, df = 1; P < 0.0001). Because nucleotide frequencies are very close to equal for the fertilin α data set (T = 0.25, C = 0.24, A = 0.25, G = 0.26), model K80 is as realistic as model HKY85 for inferring nucleotide substitutions and ancestral sequences (their log likelihood estimates are almost equal). I used the simpler model K80 to reconstruct sequences ancestral to the gene duplication event in fertilin α. A sequence ancestral to the Macaca-Saguinus-Papio fertilin α I and fertilin α II duplication event (see arrow in figure 2) was used to test for codon sites under positive selection using the codemlsites program of PAML.
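    The nested comparisons above each differ by one free parameter (κ), which is where df = 1 comes from. The sketch below makes that bookkeeping explicit and recomputes the P values from the reported test statistics. The parameter counts are the standard ones for these models; the function names are hypothetical, and the shared gamma shape parameter is left out because it cancels in each difference.

```python
import math

# Free substitution-model parameters (shared gamma shape parameter omitted).
N_PARAMS = {"JC69": 0, "K80": 1, "F81": 3, "HKY85": 4}


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def compare(simple, complex_, stat):
    """Return (df, P) for a likelihood ratio test between two nested models."""
    df = N_PARAMS[complex_] - N_PARAMS[simple]
    if df != 1:
        raise ValueError("this sketch only handles comparisons with df = 1")
    return df, chi2_sf_df1(stat)


# Test statistics (twice the log likelihood difference) as reported in the text.
for simple, complex_, stat in [("JC69", "K80", 643.6), ("F81", "HKY85", 638.6)]:
    df, p = compare(simple, complex_, stat)
    print(f"{simple} vs {complex_}: df = {df}, P = {p:.3g}")
```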

    Model M8 (ℓ = -9,723.00) fitted the data better than M7 (ℓ = -9,769.90); the test statistic was 2Δℓ = 93.8 (df = 2; P < 0.0001), suggesting that positive selection has shaped the evolution of single codon sites between species prior to the fertilin α duplication event. The question remains whether positive selection leading to acquisition of new adaptations among duplicates has been responsible for the pattern of positive selection previously detected. Compared to the results obtained when using fertilin α I and α II duplicates, the use of an ancestral sequence shows more sites under positive selection within the disintegrin domain (7) but similar numbers within the cysteine-rich (7) and EGF-like domains (1). The sites in the Cavia cobaya reference sequence (shown in figure 3) inferred to be under positive selection at a 95% posterior probability threshold level (P > 0.95) when the ancestral fertilin α sequence is used are 476, 508, 510, 520, 535, 543, and 546 (disintegrin domain); 588, 591, 594, 611, 644, 672, and 674 (cysteine-rich domain); and 678 (EGF-like domain).

    Therefore, the duplication event of fertilin α in Papio, Macaca, and Saguinus is not responsible for the positive selection signal detected within the cysteine-rich and EGF-like domains. The result also shows that preservation of function within the disintegrin domain among duplicates might have masked a stronger ancestral pattern of positive selection for disintegrin. A previous study has shown that sequence similarity among these three species is low between fertilin α I and fertilin α II genes for the amino-end (50% to 65% identity) and the carboxy-end (40% identity), whereas they are almost 100% identical in the central region. This level of similarity among duplicates may partially explain the weaker positive selection signal originally detected using the fertilin α I and α II paralogs.

    Other Members of the ADAM Gene Family

    ADAM proteins with a pattern of expression in reproductive tissues have been implicated in sperm maturation (ADAM7, ADAM21, and ADAM28) and sperm-egg fusion (ADAM21, fertilin α and β, ADAM18). Members of the ADAM gene family are also implicated in other functions such as neurogenesis and myogenesis. Previous studies comparing the pattern of molecular evolution of reproductive versus nonreproductive genes have done so by pooling together different genes into these two classes. This type of approach is particularly sensitive to the gene samples being influenced by characteristics other than their site of expression or function. Studying the pattern of molecular evolution for genes belonging to the same family that have evolved new functions, such as the ADAM genes, offers the advantage that the pool of genes compared is more homogeneous in terms of evolutionary history.

    Fertilin showed a high proportion of nonsynonymous substitutions within adhesion domains such as the disintegrin and cysteine-rich domains. An open question is whether this pattern is common to all ADAM genes. To answer this question, I estimated dN and dS for a total of 14 different ADAM genes for which sequences were available for mouse and human. The analysis compared the metalloprotease and disintegrin domains. shows that in 13 out of 14 comparisons, dS is higher in absolute value for the disintegrin than the metalloprotease domain and the probability of this occurring by chance is P = 0.00085. However, the dN values for the disintegrin and metalloprotease domains are split evenly, with the higher dN observed for disintegrin 7 out of 14 times (P = 0.21). Even though the disintegrin domain seems to accumulate a higher proportion of synonymous substitutions for all ADAM genes considered, the proportion of nonsynonymous substitutions shows a more random pattern, and so the elevated accumulation of nonsynonymous changes is not necessarily a reflection of a higher rate of synonymous neutral substitutions. When ADAM genes expressed in reproductive tissues (ADAM18, ADAM28, fertilin β, ADAM21, and ADAM7) are compared to others, they show nonsignificant differences in dS estimates (Zij estimates with P values > 0.05). However, a significant increase is observed in the proportion of nonsynonymous substitutions for ADAM7 and fertilin β and for ADAMs 18 and 28 (except when compared to ADAMs 8 and 15). ADAM21 shows a pattern of nonsynonymous substitutions more similar to nonreproductive genes (significantly higher than five and nonsignificantly different from four nonreproductive ADAMs). These results show that a high proportion of nonsynonymous replacements might be a common pattern for most ADAM genes with a potential role in sperm-egg interaction and sperm maturation, but whether positive selection drives the accumulation of nonsynonymous replacements in genes other than fertilin remains an open question.
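
    The two probabilities quoted for the domain comparison (13 of 14 for dS, 7 of 14 for dN) agree with the binomial probability of observing exactly that count when either domain is equally likely to show the higher value. The sketch below assumes that this 50:50 point probability is how the values were obtained, which the text does not state explicitly. The function name is hypothetical.

```python
from math import comb


def exact_count_probability(k, n):
    """Probability of exactly k 'disintegrin higher' outcomes in n
    comparisons under a 50:50 null (binomial point probability)."""
    return comb(n, k) / 2 ** n


# dS: disintegrin higher in 13 of 14 mouse-human comparisons.
p_ds = exact_count_probability(13, 14)
# dN: disintegrin higher in 7 of 14 comparisons.
p_dn = exact_count_probability(7, 14)

print(f"dS: P = {p_ds:.5f}")  # prints "dS: P = 0.00085"
print(f"dN: P = {p_dn:.2f}")  # prints "dN: P = 0.21"
```

    Note that a one-tailed sign test would instead sum the probabilities of 13 and 14 out of 14, giving a slightly larger value for the dS comparison.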

    Table 4 Estimates of dS and dN (± Standard Errors) in Comparisons Between Mouse and Human ADAM Genes Expressed in Different Tissues (table omitted)

    Conclusion

    Fertilin offers the advantage of a rather well-defined protein structure, making a domain-specific analysis of the dN/dS ratio possible. However, the window approach suffers from a limitation similar to that encountered when the full sequence of a gene is examined, namely that single sites under positive selection might be masked within the sequence of the domain being analyzed. Such is the case of the disintegrin and cysteine-rich domains of fertilin β. These domains show a high proportion of synonymous substitutions, making their dN/dS estimates similar to or lower than those of other domains, even when their dN values were approximately twice as large as those of other domains (see ). Single-site searches, such as the codon-site search approach implemented in PAML, offer the possibility of detecting sites under positive selection within a gene region with an elevated proportion of synonymous changes. This approach also has the advantage of not relying a priori on domain classification and is therefore particularly useful when little is known about the protein structure. Results from both the window and the single-codon PAML analyses agree that the proteolytically processed domains of both fertilin α and β (amino-end and metalloprotease domain) are under stronger selective constraints than other sites. The constraints could arise because cleavage of these domains is important for the proper maturation and capacitation of the fertilin peptide. Both approaches also agree on the high level of divergence at the carboxy-end of fertilin α, although the ambiguity of the alignments in this region of the protein makes it difficult to single out positively selected sites.
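
    The masking effect described above can be illustrated with a small numerical sketch. The dN and dS values below are hypothetical, chosen only to show the arithmetic, and are not estimates from this study.

```python
def omega(dn, ds):
    """Return the dN/dS ratio for one domain."""
    return dn / ds


# Hypothetical (dN, dS) pairs: the adhesion domain has twice the dN of the
# other domain, but an even larger excess of synonymous substitutions.
domains = {
    "other domain": (0.10, 0.40),
    "adhesion domain": (0.20, 1.00),
}

for name, (dn, ds) in domains.items():
    print(f"{name}: dN = {dn:.2f}, dS = {ds:.2f}, dN/dS = {omega(dn, ds):.2f}")
```

    In this illustration the adhesion domain has the lower dN/dS ratio despite its doubled dN, which is why a domain-wide ratio can hide individual codons evolving under positive selection.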

    Different studies have suggested adhesion activity for fertilin domains other than disintegrin. suggested a potential role of the cysteine-rich and/or EGF-like domains in sperm-egg adhesion, and the fusion peptide of the fertilin α protein was previously identified within the cysteine-rich domain. In this article, domains potentially involved in sperm-egg interactions (disintegrin and cysteine-rich domains) show signs of positive selection for both fertilin α and β. The sign of positive selection for the disintegrin domain of fertilin α seems to be partially masked due to strong preservation of sequence and perhaps function within this domain after a recent duplication event in the common ancestor to Papio, Macaca, and Saguinus.

    It remains to be determined whether the extremely high degree of sequence differentiation in the carboxy-end of fertilin α (particularly within the transmembrane and cytoplasmic tail domains) is a consequence of different species having evolved very specific patterns of membrane localization of fertilin in mature sperm. If that is the case, why the signal is only detected for fertilin α but not fertilin β will need to be explained.

    Acknowledgements

    I would like to acknowledge Andy Clark and Brian Lazzaro for their comments and suggestions on the manuscript. Comments received from David Rand and two anonymous reviewers were extremely helpful in improving the original submission. This work was supported by an NSERC grant to A.C.

    Literature Cited

    Aguadé, M. 1998. Different forces drive the evolution of the Acp26Aa and Acp26Ab accessory gland genes in the Drosophila melanogaster species complex. Genetics 150:1079-1080.

    Aguadé, M. 1999. Positive selection drives the evolution of the Acp29AB accessory gland protein in Drosophila. Genetics 152:543-551.

    Aguadé, M., N. Miyashita, C. H. Langley. 1992. Polymorphism and divergence in the Mst26A male accessory gland gene region in Drosophila. Genetics 132:755-770.

    Begun, D. J., P. Whitley, B. L. Todd, H. M. Waldrip-Dail, A. G. Clark. 2000. Molecular population genetics of male accessory gland proteins in Drosophila. Genetics 156:1879-1888.

    Bielawski, J. P., Z. Yang. 2001. Positive and negative selection in the DAZ gene family. Mol. Biol. Evol 18:523-529.

    Blobel, C. P., T. G. Wolfsberg, C. W. Turck, D. G. Myles, P. Primakoff, J. M. White. 1992. A potential fusion peptide and an integrin ligand domain in a protein active in sperm-egg fusion. Nature 356:248-252.

    Cho, C., D. O. Bunch, J.-E. Faure, E. H. Goulding, E. M. Eddy, P. Primakoff, D. G. Myles. 1998. Fertilization defects in sperm from mice lacking fertilin β. Science 281:1857-1859.

    Civetta, A., R. S. Singh. 1999. Broad-sense sexual selection, sex gene pool evolution, and speciation. Genome 42:1033-1041.

    Cornwall, G. A., N. Hsia. 1997. ADAM7, a member of the ADAM (a disintegrin and metalloprotease) gene family is specifically expressed in the mouse anterior pituitary and epididymus. Endocrinology 138:4262-4272.

    Cowan, A. E., D. E. Koppel, L. A. Vargas, G. R. Hunnicut. 2001. Guinea pig fertilin exhibits restricted lateral mobility in epididymal sperm and becomes freely diffusing during capacitation. Dev. Biol 236:502-509.

    Evans, J. P., R. M. Schultz, G. S. Kopf. 1995. Mouse sperm-egg plasma membrane interactions: analysis of roles of egg integrins and the mouse sperm homologue of PH-30 (fertilin) β. J. Cell Sci 108:3267-3278.

    Evans, J. P. 1998. Roles of the disintegrin domains of mouse fertilins α and β in fertilization. Biol. Reprod 59:145-152.

    Goldman, N., Z. Yang. 1994. A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol. Biol. Evol 11:725-736.

    Hunnicut, G. R., D. E. Koppel, D. G. Myles. 1997. Analysis of the process of localization of fertilin to the sperm posterior head plasma membrane domain during sperm maturation in the epididymis. Dev. Biol 191:146-159.

    Huovila, A. P., E. A. Almeida, J. M. White. 1996. ADAMs and cell fusion. Cell Biol 8:692-699.

    Jury, J. A., J. Frayne, L. Hall. 1998. Sequence analysis of a variety of primate fertilin α genes: evidence for non-functional genes in the gorilla and man. Mol. Rep. Dev 51:92-97.

    Karn, R. C., M. W. Nachman. 1999. Reduced nucleotide variability at an androgen-binding protein locus (Abpa) in house mice: evidence for positive natural selection. Mol. Biol. Evol 16:1192-1197.

    Kumar, S., K. Tamura, I. B. Jakobsen, M. Nei. 2001. MEGA2: molecular evolutionary genetics analysis software. Bioinformatics 17:1244-1245.

    Lee, Y.-H., T. Ota, V. D. Vacquier. 1995. Positive selection is a general phenomenon in the evolution of abalone sperm lysin. Mol. Biol. Evol 12:231-238.

    Lynch, M., A. Force. 2000. The probability of duplicate gene preservation by subfunctionalization. Genetics 154:459-473.

    McLeskey, S. B., C. Dowds, R. Carballada, R. R. White, P. M. Saling. 1998. Molecules involved in mammalian sperm-egg interaction. Int. Rev. Cyt 177:57-113.

    Metz, E. C., S. R. Palumbi. 1996. Positive selection and sequence rearrangements generate extensive polymorphism in the gamete recognition protein bindin. Mol. Biol. Evol 13:397-406.

    Morgenstern, B., A. Dress, T. Werner. 1996. Multiple DNA and protein sequence alignment based on segment-to-segment comparison. Proc. Natl. Acad. Sci. USA 93:12098-12103.

    Myles, D. G., P. Primakoff. 1997. Why did the sperm cross the cumulus? To get to the oocyte. Functions of the sperm surface proteins PH-20 and fertilin in arriving at, and fusing with, the egg. Biol. Reprod 56:320-327.

    Nei, M., T. Gojobori. 1986. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol 3:418-426.

    Nei, M., S. Kumar. 2000. Molecular evolution and phylogenetics. Oxford University Press, New York.

    Nielsen, R., Z. Yang. 1998. Likelihood models for detecting positively selected amino acid sites and applications to the HIV-1 envelope gene. Genetics 148:929-936.

    O'Neill, R. J. W., M. D. B. Eldridge, R. H. Crozier, J. A. M. Graves. 1997. Low levels of sequence divergence in rock wallabies (Petrogale) suggest a lack of positive directional selection in Sry. Mol. Biol. Evol 14:350-353.

    Perry, A. C., P. M. Gichuhi, R. Jones, L. Hall. 1995. Cloning and analysis of monkey fertilin reveals novel alpha subunit isoforms. Biochem. J 307:843-850.

    Primakoff, P., H. Hyatt, J. Tredick-Kline. 1987. Identification and purification of a sperm surface protein with a potential role in sperm-egg membrane fusion. J. Cell Biol 104:141-149.

    Primakoff, P., D. G. Myles. 2000. The ADAM gene family: surface proteins with adhesion and protease activity. Trends Genet 16:83-87.

    Swanson, W. J., A. G. Clark, H. M. Waldrip-Dail, M. F. Wolfner, C. F. Aquadro. 2001a. Evolutionary EST analysis identifies rapidly evolving male reproductive proteins in Drosophila. Proc. Natl. Acad. Sci. USA 98:7375-7379.

    Swanson, W. J., Z. Yang, M. F. Wolfner, C. F. Aquadro. 2001b. Positive Darwinian selection drives the evolution of several female reproductive proteins in mammals. Proc. Natl. Acad. Sci. USA 98:2509-2514.

    Thompson, J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, D. G. Higgins. 1997. The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 25:4876-4882.

    Ting, C.-T., S.-C. Tsaur, M.-L. Wu, C.-I. Wu. 1998. A rapidly evolving homeobox at the site of a hybrid sterility gene. Science 282:1501-1504.

    Tsaur, S.-C., C.-I. Wu. 1997. Positive selection and the molecular evolution of a gene of male reproduction, Acp26Aa of Drosophila. Mol. Biol. Evol 14:544-549.

    Tucker, P. K., B. L. Lundrigan. 1993. Rapid evolution of the sex determining locus in old world mice and rats. Nature 364:715-717.

    Whitfield, L. S., R. Lovell-Badge, P. N. Goodfellow. 1993. Rapid sequence evolution of the mammalian sex-determining gene SRY. Nature 364:713-715.

    Wolfsberg, T. G., P. D. Straight, R. L. Gerena, A.-P. J. Huovila, P. Primakoff, D. G. Myles, J. M. White. 1995. ADAM, a widely distributed and developmentally regulated gene family encoding membrane proteins with a disintegrin and metalloprotease domain. Dev. Biol 169:378-383.

    Wong, G. E., X. Zhu, C. E. Praters, E. Oh, J. P. Evans. 2001. Analysis of fertilin α (ADAM1)-mediated sperm-egg cell adhesion during fertilization and identification of an adhesion-mediating sequence in the disintegrin-like domain. J. Biol Chem 276:24937-24945.

    Wyckoff, G. J., W. Wang, C.-I. Wu. 2000. Rapid evolution of male reproductive genes in the descent of man. Nature 403:304-309.

    Yang, Z. 1993. Maximum-likelihood estimation of phylogeny from DNA sequences when substitution rates differ over sites. Mol. Biol. Evol 10:1396-1401.

    Yang, Z. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci 13:555-556.

    Yang, Z., N. Goldman, A. Friday. 1994. Comparison of models for nucleotide substitution used in maximum-likelihood phylogenetic estimation. Mol. Biol. Evol 11:316-324.

    Yang, Z., R. Nielsen, N. Goldman, A.-M. K. Pedersen. 2000. Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics 155:431-449.

    Yuan, R., P. Primakoff, D. G. Myles. 1997. A role for the disintegrin domain of cyritestin, a sperm surface protein belonging to the ADAM family, in mouse sperm-egg plasma membrane adhesion and fusion. J. Cell Biol 137:105-112.

    Accepted for publication August 29, 2002. (Alberto Civetta)