I'm interested in the mechanisms and causes of species divergence and differentiation, in an ecological/evolutionary context, and over both very long and very short evolutionary time scales. What are the genomic changes that accompany adaptation to different niches? How can these changes be identified? Why do changes occur between species, and to what extent are these changes adaptive? I'm currently developing genome-wide, comparative-genomic approaches to look for evidence of positive selection on protein-coding genes (or other genomic loci), with the more distant goal of relating this evidence to phenotypic changes between species.
1. Selective Signatures: An approach to detect species-specific, gene-specific Natural Selection
In this simple, empirical technique to identify proteins under an unusual regime of selection, we assume that most of the evolutionary change in a protein is due to two factors: the overall rate of evolution in the genome (or species) to which it belongs, and the rate of evolution of its protein family (or orthologous group). Deviations from the rate predicted by these two factors can be interpreted as evidence for positive or relaxed negative selection (when substitution rates are unusually rapid), or strong negative selection (when substitution rates are unusually slow).
The approach requires a set of orthologous proteins from a group of related species, each with a gene tree and branch lengths. The output is a set of candidate proteins under species-specific selection.
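As a rough illustration of the idea (a minimal sketch, not the published implementation), the expected rate for each protein can be modeled as the product of a family effect and a species effect, with the deviation reported as a log ratio of observed to expected. The function name `selective_signatures`, the log2 scale, the simple mean-based fit, and the toy family and species names are all assumptions made for this example. It also assumes a complete matrix in which every family has a positive branch length in every species.

```python
import math

def selective_signatures(rates):
    """Return log2(observed / expected) for each (family, species) pair.

    `rates` maps family -> {species: branch length}. This sketch assumes
    every family contains every species and all lengths are positive.
    """
    families = list(rates)
    species = list(rates[families[0]])
    # In log space the two multiplicative factors become additive effects.
    log_r = {f: {s: math.log2(rates[f][s]) for s in species} for f in families}
    n_cells = len(families) * len(species)
    grand = sum(log_r[f][s] for f in families for s in species) / n_cells
    fam_mean = {f: sum(log_r[f].values()) / len(species) for f in families}
    sp_mean = {s: sum(log_r[f][s] for f in families) / len(families)
               for s in species}
    # Expected log rate = family mean + species mean - grand mean.
    return {
        (f, s): log_r[f][s] - (fam_mean[f] + sp_mean[s] - grand)
        for f in families
        for s in species
    }

# Toy data (hypothetical): rates that are exactly family rate x species rate,
# plus one lineage-specific acceleration.
family_rate = {"famA": 0.5, "famB": 1.0, "famC": 2.0}
species_rate = {"sp1": 0.1, "sp2": 0.2, "sp3": 0.4}
toy = {f: {s: fr * sr for s, sr in species_rate.items()}
       for f, fr in family_rate.items()}
toy["famB"]["sp2"] *= 8  # unusually rapid evolution in one species

signatures = selective_signatures(toy)
top = max(signatures, key=signatures.get)
print(top, round(signatures[top], 2))
```

In this toy case the accelerated protein (famB in sp2) receives the largest positive deviation. Strongly negative values would instead point to unusually slow evolution. Real data would need a treatment of missing orthologs and a significance threshold, neither of which is addressed here.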
We recently applied the 'selective signatures' technique to a set of ~1000 gene families from 30 species of Gammaproteobacteria, and showed how it can discern functional relationships between genes:
Shapiro BJ, Alm EJ (2008) Comparing Patterns of Natural Selection across Species Using Selective Signatures. PLoS Genetics 4(2): e23. doi:10.1371/journal.pgen.0040023 http://dx.doi.org/10.1371/journal.pgen.0040023
2. Slow:Fast substitution ratios
- Development of this new method, complementary to selective signatures, is underway.
- The principle is that substitutions in conserved, 'slow-evolving' sites of a protein are unexpected without invoking some type of protein-level natural selection and functional adaptation. Therefore, we define the S:F ratio (substitutions per slow site / substitutions per fast site) as a simple metric to detect deviations from the expected neutral substitution rate.
- S:F is essentially a more general version of the widely used dN/dS ratio for detecting protein-level selection.
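The S:F definition above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `sf_ratio`, the use of a per-site rate estimated across the whole protein family to classify sites, and the median cutoff separating 'slow' from 'fast' sites are all hypothetical choices, since the classification rule is not specified here.

```python
import math
from statistics import median

def sf_ratio(site_rates, branch_subs, cutoff=None):
    """Substitutions per slow site divided by substitutions per fast site.

    site_rates  : per-site evolutionary rate across the family (assumed input)
    branch_subs : substitutions observed at each site on the branch of interest
    cutoff      : sites with rate <= cutoff count as slow (default: the median)
    """
    if len(site_rates) != len(branch_subs):
        raise ValueError("site_rates and branch_subs must have equal length")
    if cutoff is None:
        cutoff = median(site_rates)
    slow = [n for r, n in zip(site_rates, branch_subs) if r <= cutoff]
    fast = [n for r, n in zip(site_rates, branch_subs) if r > cutoff]
    if not slow or not fast:
        raise ValueError("need at least one slow and one fast site")
    slow_per_site = sum(slow) / len(slow)
    fast_per_site = sum(fast) / len(fast)
    if fast_per_site == 0:
        # No substitutions at fast sites: the ratio is undefined or unbounded.
        return math.nan if slow_per_site == 0 else math.inf
    return slow_per_site / fast_per_site

# Hypothetical six-site protein: three conserved sites, three variable sites.
rates = [0.1, 0.2, 0.1, 1.5, 2.0, 1.8]
typical = sf_ratio(rates, [0, 1, 0, 2, 1, 1])   # changes mostly at fast sites
unusual = sf_ratio(rates, [1, 1, 1, 1, 0, 0])   # changes mostly at slow sites
print(typical, unusual)
```

A branch whose substitutions fall mostly at fast sites gives a ratio below 1, while a branch with an excess of substitutions at conserved sites gives a ratio above 1 and would be a candidate for closer inspection.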
Main findings so far, based on a data set of ~1000 orthologous proteins from 30 species of Gammaproteobacteria:
- the S:F ratio behaves similarly to the dN/dS ratio in detecting selection on codon data, and may also detect instances of protein-level selection not discernible by dN/dS
- genes involved in cell envelope biogenesis, ion transport and metabolism, signal transduction, and especially motility and secretion, are frequent targets of selection in our genome-wide comparative study of γ-proteobacteria
- the evolution of some species is dominated by selection on specific gene functions (e.g. energy production in Pseudomonas fluorescens), whereas other species evolve mostly by drift (e.g. elevated S:F across most gene functions in Buchnera spp.)
- potentially adaptive substitutions identified by the S:F approach can be mapped to changes in protein structure and function
jesse1 at mit dot edu