Multilocus patterns of polymorphism and selection across the X chromosome of Caenorhabditis remanei

Cutter AD

Genetics 2008 Mar;178(3):1661-72

PubMed PMID: 18245859

Abstract

Natural selection and neutral processes such as demography, mutation, and gene conversion all contribute to patterns of polymorphism within genomes. Identifying the relative importance of these varied components in evolution provides the principal challenge for population genetics. To address this issue in the nematode Caenorhabditis remanei, I sampled nucleotide polymorphism at 40 loci across the X chromosome. The site-frequency spectrum for these loci provides no evidence for population size change, and one locus presents a candidate for linkage to a target of balancing selection. Selection for codon usage bias leads to the non-neutrality of synonymous sites, and despite its weak magnitude of effect (N(e)s approximately 0.1), is responsible for profound patterns of diversity and divergence in the C. remanei genome. Although gene conversion is evident for many loci, biased gene conversion is not identified as a significant evolutionary process in this sample. No consistent association is observed between synonymous-site diversity and linkage-disequilibrium-based estimators of the population recombination parameter, despite theoretical predictions about background selection or widespread genetic hitchhiking, but genetic map-based estimates of recombination are needed to rigorously test for a diversity-recombination relationship. Coalescent simulations also illustrate how a spurious correlation between diversity and linkage-disequilibrium-based estimators of recombination can occur, due in part to the presence of unbiased gene conversion. These results illustrate the influence that subtle natural selection can exert on polymorphism and divergence, in the form of codon usage bias, and demonstrate the potential of C. remanei for detecting natural selection from genomic scans of polymorphism.
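As a rough illustration of the site-frequency-spectrum summaries this abstract refers to, the Python sketch below tabulates an unfolded spectrum and two standard diversity estimators. The input encoding (equal-length strings of "0" and "1", with "1" marking the derived allele) and all function names are assumptions made for illustration; they are not taken from the paper.

```python
def site_frequency_spectrum(haplotypes):
    """Unfolded SFS: entry i-1 counts sites where i of n samples carry the derived allele."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    length = len(haplotypes[0])
    if any(len(h) != length for h in haplotypes):
        raise ValueError("haplotypes must have equal length")
    sfs = [0] * (n - 1)
    for site in zip(*haplotypes):          # iterate over alignment columns
        derived = site.count("1")
        if 0 < derived < n:                # keep segregating sites only
            sfs[derived - 1] += 1
    return sfs


def watterson_theta(sfs):
    """Watterson's estimator: segregating sites divided by the harmonic number a1."""
    n = len(sfs) + 1
    a1 = sum(1.0 / i for i in range(1, n))
    return sum(sfs) / a1


def nucleotide_diversity(sfs):
    """Mean number of pairwise differences (pi), computed from the SFS."""
    n = len(sfs) + 1
    pairs = n * (n - 1) / 2.0
    return sum(count * i * (n - i) for i, count in enumerate(sfs, start=1)) / pairs


# Hypothetical toy sample of four haplotypes at four sites.
sample = ["0101", "0011", "0001", "0000"]
spectrum = site_frequency_spectrum(sample)
theta_w = watterson_theta(spectrum)
pi = nucleotide_diversity(spectrum)
```

Under the standard neutral model with constant population size, these two estimators have the same expectation. An excess of rare variants pulls pi below Watterson's theta, which is the kind of skew that tests of population size change look for.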

Divergence times in Caenorhabditis and Drosophila inferred from direct estimates of the neutral mutation rate

Cutter AD

Mol. Biol. Evol. 2008 Apr;25(4):778-86

PubMed PMID: 18234705

Abstract

Accurate inference of the dates of common ancestry among species forms a central problem in understanding the evolutionary history of organisms. Molecular estimates of divergence time rely on the molecular evolutionary prediction that neutral mutations and substitutions occur at the same constant rate in genomes of related species. This underlies the notion of a molecular clock. Most implementations of this idea depend on paleontological calibration to infer dates of common ancestry, but taxa with poor fossil records must rely on external, potentially inappropriate, calibration with distantly related species. The classic biological models Caenorhabditis and Drosophila are examples of such problem taxa. Here, I illustrate internal calibration in these groups with direct estimates of the mutation rate from contemporary populations that are corrected for interfering effects of selection on the assumption of neutrality of substitutions. Divergence times are inferred among 6 species each of Caenorhabditis and Drosophila, based on thousands of orthologous groups of genes. I propose that the 2 closest known species of Caenorhabditis shared a common ancestor <24 MYA (Caenorhabditis briggsae and Caenorhabditis sp. 5) and that Caenorhabditis elegans diverged from its closest known relatives <30 MYA, assuming that these species pass through at least 6 generations per year; these estimates are much more recent than reported previously with molecular clock calibrations from non-nematode phyla. Dates inferred for the common ancestor of Drosophila melanogaster and Drosophila simulans are roughly concordant with previous studies. These revised dates have important implications for rates of genome evolution and the origin of self-fertilization in Caenorhabditis.
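The arithmetic behind a mutation-rate-calibrated date can be sketched with the textbook neutral relation T = K / (2 * mu), where K is the neutral divergence per site between two species and mu is the per-generation mutation rate. This is a simplified illustration, not the paper's full procedure, which also corrects for selection. The function name and the numbers are hypothetical.

```python
def divergence_time_years(k_neutral, mu_per_generation, generations_per_year):
    """Years since common ancestry under a strict neutral clock.

    k_neutral: neutral substitutions per site separating the two species
    mu_per_generation: mutations per site per generation
    generations_per_year: assumed generations per calendar year
    """
    if k_neutral < 0 or mu_per_generation <= 0 or generations_per_year <= 0:
        raise ValueError("inputs must be positive (k_neutral may be zero)")
    # Both lineages accumulate substitutions, hence the factor of 2.
    generations = k_neutral / (2.0 * mu_per_generation)
    return generations / generations_per_year


# Hypothetical values chosen only to show the calculation.
years = divergence_time_years(k_neutral=0.2,
                              mu_per_generation=1e-8,
                              generations_per_year=10)
```

Because the date scales inversely with generations per year, an assumed minimum generation rate yields an upper bound on the divergence time, which is why the abstract states its estimates as "<24 MYA" and "<30 MYA".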

Evolution of the type III secretion system and its effectors in plant-microbe interactions

McCann HC, Guttman DS

New Phytol. 2008;177(1):33-47

PubMed PMID: 18078471

Abstract

Many bacterial plant pathogens require the type III secretion system (T3SS) and its effector proteins (T3SEs) to invade and extract nutrients from their hosts successfully. While the molecular function of this system is being studied intensively, we know comparatively little about the evolutionary and ecological pressures governing its fate over time, and even less about the detailed mechanisms underlying and driving complex T3SS-mediated coevolutionary dynamics. In this review we summarize our current understanding of how host-pathogen interactions evolve, with a particular focus on the T3SS of bacterial plant pathogens. We explore the evolutionary origins of the T3SS relative to the closely related flagellar system, and investigate the evolutionary pressures on this secretion and translocation apparatus. We examine the evolutionary forces acting on T3SEs, and compare the support for vertical descent with modification of these virulence-associated systems (pathoadaptation) vs horizontal gene transfer. We address the evolutionary origins of T3SEs from the perspective of both the evolutionary mechanisms that generate new effectors, and the mobile elements that may be the source of novel genetic material. Finally, we propose a number of questions raised by these studies, which may serve to guide our thinking about these complex processes.

Host-pathogen interplay and the evolution of bacterial effectors

Stavrinides J, McCann HC, Guttman DS

Cell. Microbiol. 2008 Feb;10(2):285-92

PubMed PMID: 18034865

Abstract

Many bacterial pathogens require a type III secretion system (T3SS) and suite of type III secreted effectors (T3SEs) to successfully colonize their hosts, extract nutrients and consequently cause disease. T3SEs, in particular, are key components of the bacterial arsenal, as they function directly inside the host to disrupt or suppress critical components of the defence network. The development of host defence and surveillance systems imposes intense selective pressures on these bacterial virulence factors, resulting in a host-pathogen co-evolutionary arms race. This arms race leaves its genetic signature in the pattern and structure of natural genetic variation found in T3SEs, thereby permitting us to infer the specific evolutionary processes and pressures driving these interactions. In this review, we summarize our current knowledge of T3SS-mediated host-pathogen co-evolution. We examine the evolution of the T3SS and the T3SEs that traverse it, in both plant and animal pathosystems, and discuss the processes that maintain these important pathogenicity determinants within pathogen populations. We go on to examine the possible origins of T3SEs, the mechanisms that give rise to new T3SEs and the processes that underlie their evolution.

Combining classifiers to predict gene function in Arabidopsis thaliana using large-scale gene expression measurements

Lan H, Carson R, Provart NJ, Bonner AJ

BMC Bioinformatics 2007;8:358

PubMed PMID: 17888165

Abstract

BACKGROUND: Arabidopsis thaliana is the model species of current plant genomic research with a genome size of 125 Mb and approximately 28,000 genes. The function of half of these genes is currently unknown. The purpose of this study is to infer gene function in Arabidopsis using machine-learning algorithms applied to large-scale gene expression data sets, with the goal of identifying genes that are potentially involved in plant response to abiotic stress.

RESULTS: Using in-house and publicly available data, we assembled a large set of gene expression measurements for A. thaliana. Using those genes of known function, we first evaluated and compared the ability of basic machine-learning algorithms to predict which genes respond to stress. Predictive accuracy was measured using ROC50 and precision curves derived through cross-validation. To improve accuracy, we developed a method for combining these classifiers using a weighted-voting scheme. The combined classifier was then trained on genes of known function and applied to genes of unknown function, identifying genes that potentially respond to stress. Visual evidence corroborating the predictions was obtained using electronic Northern analysis. Three of the predicted genes were chosen for biological validation. Gene knockout experiments confirmed that all three are involved in a variety of stress responses. The biological analysis of one of these genes (At1g16850) is presented here, where it is shown to be necessary for the normal response to temperature and NaCl.

CONCLUSION: Supervised learning methods applied to large-scale gene expression measurements can be used to predict gene function. However, the ability of basic learning methods to predict stress response varies widely and depends heavily on how much dimensionality reduction is used. Our method of combining classifiers can improve the accuracy of such predictions – in this case, predictions of genes involved in stress response in plants – and it effectively chooses the appropriate amount of dimensionality reduction automatically. The method provides a useful means of identifying genes in A. thaliana that potentially respond to stress, and we expect it would be useful in other organisms and for other gene functions.
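A weighted-voting combination of classifiers, in its most generic form, can be sketched as below. The abstract does not say how the weights were chosen, so this example simply assumes each classifier has a non-negative weight (for instance, a cross-validated accuracy score) and averages the per-gene scores accordingly. The function name and the data are hypothetical.

```python
def weighted_vote(scores, weights):
    """Combine per-classifier scores into one score per item.

    scores: list with one inner list per classifier; each inner list holds
            that classifier's score for every item, in the same order
    weights: one non-negative weight per classifier (assumed to come from
             some held-out measure of accuracy)
    """
    if len(scores) != len(weights) or not scores:
        raise ValueError("need one weight per classifier")
    total = sum(weights)
    if total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative with a positive sum")
    n_items = len(scores[0])
    if any(len(s) != n_items for s in scores):
        raise ValueError("every classifier must score every item")
    # Weighted average of the classifiers' scores, item by item.
    return [
        sum(w * s[i] for w, s in zip(weights, scores)) / total
        for i in range(n_items)
    ]


# Two hypothetical classifiers scoring two genes; the first is trusted 3x more.
combined = weighted_vote([[1.0, 0.0], [0.0, 1.0]], [3.0, 1.0])
```

With these toy inputs, the more heavily weighted classifier dominates the combined score for each gene, while the weaker classifier still contributes in proportion to its weight.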

The chimeric cyclic nucleotide-gated ion channel ATCNGC11/12 constitutively induces programmed cell death in a Ca2+ dependent manner

Urquhart W, Gunawardena AH, Moeder W, Ali R, Berkowitz GA, Yoshioka K

Plant Mol. Biol. 2007 Dec;65(6):747-61

PubMed PMID: 17885810

Abstract

The hypersensitive response (HR) involves programmed cell death (PCD) in response to pathogen infection. To investigate the pathogen resistance signaling pathway, we previously identified the Arabidopsis mutant cpr22, which displays constitutive activation of multiple defense responses including HR-like cell death. The cpr22 mutation has been identified as a 3 kb deletion that fuses two cyclic nucleotide-gated ion channel (CNGC)-encoding genes, ATCNGC11 and ATCNGC12, to generate a novel chimeric gene, ATCNGC11/12. In this study, we conducted a characterization of cell death induced by transient expression of ATCNGC11/12 in Nicotiana benthamiana. Electron microscopic analysis of this cell death showed similar characteristics to PCD, such as plasma membrane shrinkage and vesicle formation. The hallmark of animal PCD, fragmentation of nuclear DNA, was also observed in ATCNGC11/12-induced cell death. The development of cell death was significantly suppressed by caspase-1 inhibitors, suggesting the involvement of caspases in this process. Recently, vacuolar processing enzyme (VPE) was isolated as the first plant caspase-like protein, which is involved in HR development. In VPE-silenced plants, development of cell death induced by ATCNGC11/12 was much slower and weaker compared to control plants, suggesting the involvement of VPE as a caspase in ATCNGC11/12-induced cell death. Complementation analysis using a Ca2+-uptake-deficient yeast mutant demonstrated that the ATCNGC11/12 channel is permeable to Ca2+. Additionally, calcium channel blockers such as GdCl3 inhibited ATCNGC11/12-induced HR formation, whereas potassium channel blockers did not. Taken together, these results indicate that the cell death that develops in the cpr22 mutant is indeed PCD and that the chimeric channel, ATCNGC11/12, is at the point of, or upstream of, the calcium signal necessary for the development of HR.

Sentinels at the wall: cell wall receptors and sensors

Humphrey TV, Bonetta DT, Goring DR

New Phytol. 2007;176(1):7-21

PubMed PMID: 17803638

Abstract

The emerging view of the plant cell wall is of a dynamic and responsive structure that exists as part of a continuum with the plasma membrane and cytoskeleton. This continuum must be responsive and adaptable to normal processes of growth as well as to stresses such as wounding, attack from pathogens and mechanical stimuli. Cell expansion involving wall loosening, deposition of new materials, and subsequent rigidification must be tightly regulated to allow the maintenance of cell wall integrity and co-ordination of development. Similarly, sensing and feedback are necessary for the plant to respond to mechanical stress or pathogen attack. Currently, understanding of the sensing and feedback mechanisms utilized by plants to regulate these processes is limited, although we can learn from yeast, where the signalling pathways have been more clearly defined. Plant cell walls possess a unique and complicated structure, but it is the protein components of the wall that are likely to play a crucial role at the forefront of perception, and these are likely to include a variety of sensor and receptor systems. Recent plant research has yielded a number of interesting candidates for cell wall sensors and receptors, and we are beginning to understand the role that they may play in this crucial aspect of plant biology.

An “Electronic Fluorescent Pictograph” browser for exploring and analyzing large-scale biological data sets

Winter D, Vinegar B, Nahal H, Ammar R, Wilson GV, Provart NJ

PLoS ONE 2007;2(8):e718

PubMed PMID: 17684564

Abstract

BACKGROUND: The exploration of microarray data and data from other high-throughput projects for hypothesis generation has become a vital aspect of post-genomic research. For the non-bioinformatics specialist, however, many of the currently available tools provide overwhelming amounts of data that are presented in a non-intuitive way.

METHODOLOGY/PRINCIPAL FINDINGS: In order to facilitate the interpretation and analysis of microarray data and data from other large-scale data sets, we have developed a tool, which we have dubbed the electronic Fluorescent Pictograph – or eFP – Browser, available at http://www.bar.utoronto.ca/, for exploring microarray and other data for hypothesis generation. This eFP Browser engine paints data from large-scale data sets onto pictographic representations of the experimental samples used to generate the data sets. We give examples of using the tool to present Arabidopsis gene expression data from the AtGenExpress Consortium (Arabidopsis eFP Browser), data for subcellular localization of Arabidopsis proteins (Cell eFP Browser), and mouse tissue atlas microarray data (Mouse eFP Browser).

CONCLUSIONS/SIGNIFICANCE: The eFP Browser software is easily adaptable to microarray or other large-scale data sets from any organism and thus should prove useful to a wide community for visualizing and interpreting these data sets for hypothesis generation.

Convergent evolution of phytopathogenic pseudomonads onto hazelnut

Wang PW, Morgan RL, Scortichini M, Guttman DS

Microbiology (Reading, Engl.) 2007 Jul;153(Pt 7):2067-73

PubMed PMID: 17600051

Abstract

Pseudomonas syringae pv. avellanae (synonym: P. avellanae, Pav) is the causal agent of hazelnut decline in Greece and Italy. The population structure and evolutionary relationships of 22 strains from these two countries were examined by multilocus sequence typing (MLST) of four housekeeping genes (gapA, gltA, gyrB and rpoD). Neighbour-joining and maximum-likelihood phylogenetic analysis revealed that Greek strains isolated from the original 1976 outbreak of hazelnut decline through 1990 were very similar to Italian strains isolated from 2002 through 2004. Other Italian strains that were isolated during the 1990s were very homogeneous and clustered in a clade that was quite distinct from the Greek isolates and Italian isolates from the 2000s. A split decomposition analysis found evidence for recombination between these two highly divergent clades in two of the four MLST housekeeping genes. Incorporating these data into a broad MLST analysis of the P. syringae species complex showed that the Pav Greek and Italian strains from the 2000s clustered with P. syringae phylogroup 1, which is predominantly composed of pathogens of tomato and Brassicaceae hosts, while the Pav Italian strains from the 1990s clustered in P. syringae phylogroup 2 and are most closely related to pea (Pisum sativum L.) pathogens. These results clearly indicate that the ability to infect hazelnuts has arisen twice. This evolutionary process may be due to de novo adaptation to hazelnut by local P. syringae strains (such as the colonizers of Leguminosae crops), or the result of genetic exchange from the original Greek Pav clonal group into a phylogroup 2 strain. The latter explanation is intriguing since there is no exchange of hazelnut propagative material between Italy and Greece, which would be a likely vector for the movement of these pathogens.

Combining population genomics and quantitative genetics: finding the genes underlying ecologically important traits

Stinchcombe JR, Hoekstra HE

Heredity (Edinb) 2008 Feb;100(2):158-70

PubMed PMID: 17314923

Abstract

A central challenge in evolutionary biology is to identify genes underlying ecologically important traits and describe the fitness consequences of naturally occurring variation at these loci. To address this goal, several novel approaches have been developed, including ‘population genomics,’ where a large number of molecular markers are scored in individuals from different environments with the goal of identifying markers showing unusual patterns of variation, potentially due to selection at linked sites. Such approaches are appealing because of (1) the increasing ease of generating large numbers of genetic markers, (2) the ability to scan the genome without measuring phenotypes and (3) the simplicity of sampling individuals without knowledge of their breeding history. Although such approaches are inherently applicable to non-model systems, to date these studies have been limited in their ability to uncover functionally relevant genes. By contrast, quantitative genetics has a rich history, and more recently, quantitative trait locus (QTL) mapping has had some success in identifying genes underlying ecologically relevant variation even in novel systems. QTL mapping, however, requires (1) genetic markers that specifically differentiate parental forms, (2) a focus on a particular measurable phenotype and (3) controlled breeding and maintenance of large numbers of progeny. Here we present current advances and suggest future directions that take advantage of population genomics and quantitative genetic approaches – in both model and non-model systems. Specifically, we discuss advantages and limitations of each method and argue that a combination of the two provides a powerful approach to uncovering the molecular mechanisms responsible for adaptation.