Julius B. Lucks/Meetings and Notes/SMBE2007/models protein evolution 2
From OpenWetWare
Ziheng Yang : A mutation selection model of codon substitution
Tue Jun 26 14:02:43 EDT 2007
- first one to propose codon models - analyze codon changes dN/dS (rather than nucleotides and AAs)
- UCL
- Rasmus Nielsen - Univ. Copenhagen
- Alan Moses thinks this will be a revolutionary talk
Words to Look Up
Talk
- Goldman & Yang 1994 model - codon substitution model
- Mol. Biol. Evol., 11, 725, 1994
- Yang and Nielsen, 1998, J. Mol. Evol.
- Subst. rate to codon j proportional to equil. freq of codon j
- does not separate mutational bias and selection on codon usage
- Bierne & Eyre-Walker, 2003, Genetics, 165, 1587
- Yang 2006, Computational Molecular Evolution, Oxford, 284
- TTT,TTC,TCT,TCC transitions - only 2 rates
- TTC, TCC preferred, others rare
- realistically want 3 rates
- model with 3 rates - Nielsen et al., 2007, Mol. Biol. Evol., 24, 228
- large rate from unpreferred to pref
- small rate for reverse
- middle rate from pref to pref and unpref to unpref
- requires a priori partitioning of codons (Hiroshi Akashi)
- Codon usage generally believed to be under selection in bacteria and Drosophila
- mammals case is less clear
- Akashi H, 1994, Genetics
- synonymous changes change protein structure and function
- Kimchi-Sarfaty, 2007, Science, 315, 525
- Komar AA, 2007, Science, 315, 466
- protein folding co-translational
- silent SNP - altered protein translation kinetics - final protein diff conformation and function
Model
- mutation rate from nucl i to j described by HKY85 or GTR (REV) applied to all 3 processes
- [math]\displaystyle{ \mu_{ij} = a_{ij}\pi_j^* }[/math] - a's symmetric
- [math]\displaystyle{ \pi^* }[/math] mutational bias parameters
- codons [math]\displaystyle{ I = i_1i_2i_3 }[/math]
- fixation probability function of selection coefficient
- Kimura M, 1962, Genetics, 47, 713 - use Kimura formula
- [math]\displaystyle{ S_{ij} = 2Ns_{ij} = 2N(f_j - f_i) }[/math]
- N number of chromosomes
- Selection on protein is modeled using [math]\displaystyle{ \omega }[/math]
- parameters in the model
- 4 mutation rates
- 60 codon fitness parameters
- sequence distance or branch lengths
- time reversible
- Markov chain is time reversible iff its rate matrix is the product of a symmetric matrix and a diagonal matrix
- equil. freq. of codon J: [math]\displaystyle{ \pi_J \propto (\pi_{j_1}^*\pi_{j_2}^*\pi_{j_3}^*)e^{F_J} }[/math]
- comments
- use of omega to detect selection on the protein does not rely on assump that synon sites evolve neutrally
- old models in codeml such as F1x4, F3x4, Fcodon - not special cases of mutation selection model
- Muse and Gaut, 1994, Mol. Biol. Evol., 11, 715
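A minimal sketch of how the fixation step in the model above could enter a substitution rate. It assumes the standard scaled form of Kimura's (1962) result, h(S) = S / (1 - e^{-S}), which is the fixation probability of a mutation with scaled selection coefficient S relative to a neutral mutation. Whether the talk used exactly this form is an assumption, and the function names `relative_fixation_rate` and `substitution_rate` are hypothetical.

```python
import math

def relative_fixation_rate(S):
    """h(S) = S / (1 - exp(-S)): fixation probability relative to a neutral mutation."""
    if abs(S) < 1e-8:
        return 1.0 + S / 2.0                # Taylor expansion near S = 0 avoids 0/0
    if S > 0:
        return S / -math.expm1(-S)          # S / (1 - e^{-S})
    return S * math.exp(S) / math.expm1(S)  # same quantity, overflow-safe for S << 0

def substitution_rate(mu_ij, F_I, F_J, omega=1.0):
    """Rate from codon I to codon J (assumed form, not confirmed by the notes).

    mu_ij : mutation rate between the nucleotides that differ
    F_I, F_J : scaled codon fitnesses, F = 2N f, so S = F_J - F_I
    omega : multiplier applied to nonsynonymous changes (use 1.0 for synonymous)
    """
    S = F_J - F_I
    return mu_ij * relative_fixation_rate(S) * omega

if __name__ == "__main__":
    # Hypothetical scaled fitnesses for an unpreferred and a preferred codon.
    F_unpref, F_pref = 0.0, 1.5
    print(substitution_rate(1.0, F_unpref, F_pref))  # unpreferred -> preferred
    print(substitution_rate(1.0, F_pref, F_unpref))  # preferred -> unpreferred
```

Under this form h(S) / h(-S) = e^{S}, so with symmetric mutation the flux e^{F_I} h(F_J - F_I) equals e^{F_J} h(F_I - F_J). That is consistent with the e^{F_J} factor in the equilibrium frequency and with the time reversibility noted above.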
Results
- why little correlation between [math]\displaystyle{ \omega_{human-macaque} }[/math] and [math]\displaystyle{ \omega_{mouse-rat} }[/math]?
- likelihood ratio test of selection on synonymous codon usage
- null model assumes synonymous codons have same fitness
- most genes are under selection on codon usage
Summary
- estimation of distances using old models fine
- in most (90%) genes - sig evidence for nat selection driving evol of codon usage
- most mutations have fitness in range |S| < 1 or 2, implying weak selection on codon usage or nearly neutral evolution
Questions
- Drosophila and bacteria - correlation between codon bias and gene expression found
- expts - bacteria can use optimal codons to translate more efficiently
- mammals not as clear
Tal Pupko : An evolutionary model that accounts for selection on synonymous mutations
Tue Jun 26 14:03:00 EDT 2007
- Cell Research and Immunology - Tel Aviv University
- Ka/Ks webserver
- collaborated with Nir Friedman
Words to Look Up
- positive selection vs. purifying selection
Talk
- codon models
- inference of evol. selection forces on a protein
- purifying selection
- phylogeny
- converting empirical AA replacement matrices into codon-based subst matrices
- methods for computing Ka/Ks
- subst. matrix rates 61x61
- Yang's M model (2000)
- K - transition/transversion ratio
- [math]\displaystyle{ \Pi }[/math] - codon frequency
- w - factor of selection
- problems
- assumes rate of Leu (UUG) to Trp (UGG) = rate of Leu (UUG) to Phe (UUU) (single transversion)
- 1st 5 times more likely
- model does not account for exact identity of AA
- assumes instan. rate between two AAs that differ by one mutation ...
- propose model
- Mechanistic Empirical Combined (MEC)
- expand 20x20 empirical AA matrix into 61x61 codon matrix
- assumptions
- sum of rate of all codons = sum of rates of AAs, but take into account codon and AA probabilities
- intensity of selection - omega - assume gamma distributed
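The problem raised in the talk with the 61x61 matrix can be seen in a small sketch. It assumes the usual Goldman-Yang style parameterization (rate to codon j proportional to its frequency, times K for a transition, times w for a nonsynonymous change, zero for multi-nucleotide changes). That this matches the exact M model discussed is an assumption. Codons are written as DNA (T for U), and the function names and parameter values are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
SENSE_CODONS = [c for c, aa in GENETIC_CODE.items() if aa != "*"]
PURINES = {"A", "G"}

def is_transition(x, y):
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    return (x in PURINES) == (y in PURINES)

def rate(codon_i, codon_j, kappa, omega, pi):
    """Off-diagonal rate i -> j under the assumed mechanistic codon model."""
    if GENETIC_CODE[codon_i] == "*" or GENETIC_CODE[codon_j] == "*":
        return 0.0                      # stop codons are excluded from the model
    diffs = [(a, b) for a, b in zip(codon_i, codon_j) if a != b]
    if len(diffs) != 1:
        return 0.0                      # only single-nucleotide changes allowed
    a, b = diffs[0]
    q = pi[codon_j]
    if is_transition(a, b):
        q *= kappa                      # transition/transversion ratio
    if GENETIC_CODE[codon_i] != GENETIC_CODE[codon_j]:
        q *= omega                      # same factor for every AA replacement
    return q

def build_rate_matrix(kappa, omega, pi):
    """61x61 matrix as nested lists; each diagonal makes its row sum to zero."""
    Q = [[rate(i, j, kappa, omega, pi) for j in SENSE_CODONS] for i in SENSE_CODONS]
    for k, row in enumerate(Q):
        row[k] = -sum(row)
    return Q

if __name__ == "__main__":
    pi = {c: 1.0 / len(SENSE_CODONS) for c in SENSE_CODONS}  # hypothetical uniform freqs
    # Leu -> Trp and Leu -> Phe get identical rates: the model ignores AA identity.
    print(rate("TTG", "TGG", 2.0, 0.2, pi), rate("TTG", "TTT", 2.0, 0.2, pi))
```

With equal codon frequencies the two Leu replacements get the same rate, whatever the physicochemical difference between Trp and Phe. This is the gap the MEC approach tries to fill with an empirical AA matrix.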
Ks conservation
- most models assume Ks (synonymous) same for all sites (reflects neutral rate of evolution)
- is this true?
HIV
- vif and pol overlap in diff frames - reduced Ks in these regions
Further
- large scale search for conserved Ks in mammals, viruses, bacteria and yeasts
- impact of Ks conservation on positive selection inference
- characterization of conserved Ks regions
- Goren, Mol. Cell, 2006
Questions
Claudia Kleinman : Protein structure and sequence evolution - statistical potentials for phylogeny
Tue Jun 26 14:03:26 EDT 2007
Words to Look Up
Talk
- probabilistic models of sequence evolution
- try to incorporate protein structure explicitly into the models
- site-dependent approaches
- simulation of evolution: Parisi & Echave 2001
- statistical potential
- knowledge-based energy function derived from analysis of known protein structures
- [math]\displaystyle{ Q_{lm}r_le^{\beta(G_l - G_m)} }[/math]
- coarse grain structure
- accounts implicitly for poorly understood complex effects
- pairwise potential that depends on distance between residues (w/ solvent accessibility potentials)
- contact potentials
- optimized for structure prediction problem
Devise stat potential for an evol context
- [math]\displaystyle{ E = E_{contact} + E_{solvent} + E_{torsion} + E_{SS} }[/math]
- derive contact map (binary n.n.'s)
- contact energy parameter
- solvent accessibility - arb # of classes
- torsion - use main chain angles
- Kleinman, BMC, 7, 326, 2006
- likelihood proportional to exp of negative of this energy (chemical potential)
- maximize (maximum likelihood)
- estimate gradient by MCMC - follow to find the maximum
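A toy sketch of the contact term only, assuming the common form in which E_contact is the sum of a pairwise energy parameter over all residue pairs marked as in contact in the binary map. The solvent, torsion and secondary-structure terms are omitted. The two-letter H/P alphabet, the energy values and the function names are hypothetical and are not the potentials from the talk.

```python
import math

def contact_energy(sequence, contacts, epsilon):
    """Sum pairwise energy parameters over the contacting residue pairs.

    sequence : string of residue types
    contacts : iterable of (i, j) index pairs taken from a binary contact map
    epsilon  : dict mapping frozenset of residue types to an energy parameter
    """
    total = 0.0
    for i, j in contacts:
        total += epsilon[frozenset((sequence[i], sequence[j]))]
    return total

def unnormalized_weight(sequence, contacts, epsilon):
    """exp(-E): proportional to the likelihood of the sequence on the structure."""
    return math.exp(-contact_energy(sequence, contacts, epsilon))

if __name__ == "__main__":
    # Hypothetical toy parameters for a two-letter alphabet.
    epsilon = {
        frozenset("H"): -1.0,    # H-H contact
        frozenset("HP"): 0.0,    # H-P contact
        frozenset("P"): 0.5,     # P-P contact
    }
    contacts = [(0, 2), (1, 3)]
    print(contact_energy("HPHH", contacts, epsilon))
    print(unnormalized_weight("HPHH", contacts, epsilon))
```

The normalizing constant is a sum over all sequences, which is presumably why the notes mention estimating the gradient by MCMC rather than computing the likelihood directly.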
model comparison using Bayes factors
- Rodrigue, MBE, 23, 1762, 2006
- Poisson distr ref model
- thermodynamic integration
Questions
Mario Fares : The three-dimensionality of molecular evolution
Tue Jun 26 14:03:53 EDT 2007
- Trinity College Dublin
Words to Look Up
Talk
- detecting selective constraints in protein-coding genes: survival of the fittest
- [math]\displaystyle{ \omega = dN/dS }[/math]
- [math]\displaystyle{ \omega < 1 }[/math] - purifying
- [math]\displaystyle{ \omega > 1 }[/math] - positive selection
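The classification above can be written as a short helper. This is only an illustration of the definition: the function name, the `tol` parameter and the "neutral" label for values near 1 are assumptions, and a real analysis would test significance rather than compare point estimates.

```python
def classify_omega(dN, dS, tol=0.0):
    """Return (omega, label) with omega = dN / dS.

    tol is a hypothetical half-width around 1 within which omega is called neutral.
    """
    if dS <= 0:
        raise ValueError("dS must be positive to compute omega")
    omega = dN / dS
    if omega < 1.0 - tol:
        label = "purifying"
    elif omega > 1.0 + tol:
        label = "positive"
    else:
        label = "neutral"
    return omega, label

if __name__ == "__main__":
    print(classify_omega(0.02, 0.10))  # hypothetical dN, dS values
```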
Questions
Allan Drummond : Modeling evolution when ribosomes fail
Tue Jun 26 14:04:15 EDT 2007
- w/ Claus Wilke - UT Austin
Words to Look Up
Talk
- ribosomes fail - don't ignore when model protein evol
- near-universal observations
- coding sequences evolve at very diff rates
- dN and dS correlate
- high expressed proteins evolve slowly
- codons matching abundant tRNAs preferred
- high expressed genes
- conserved sites (Akashi 1994 - trans accuracy)
- codon biased genes have fewer dS and dN
- in matrix form - matrices look like block structure
- bad news - not independent - PCA would predict just one factor
- 1 protein in 5 mistranslated
- can still fold
- or can misfold
- selection can act to favor protein sequences that are robust to mistranslation
- certain codons translated 6 times more accurately (model)
- lattice protein model
- Bloom, PNAS (2005,2006)
- Taverna, Goldstein, Proteins (2001)
- anything within 5 kcal/mol of the ground state will fold
- translational selection alone sufficient to explain the observed correlation matrix patterns
- Akashi 1994
- select for speed - don't matter where opt codons are - have to go through all of them
- select for acc - should put opt codons at most highly conserved sites
- this allows a within gene test to see what matters most
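A back-of-envelope sketch of how a figure like "1 protein in 5 mistranslated" can arise, assuming independent errors at each codon. The per-codon error rate (5e-4) and the protein length (400 codons) are illustrative assumptions and are not taken from these notes. The function name is hypothetical.

```python
def fraction_mistranslated(per_codon_error_rate, length_codons):
    """Probability that a protein carries at least one translation error,
    assuming errors occur independently at each codon."""
    return 1.0 - (1.0 - per_codon_error_rate) ** length_codons

if __name__ == "__main__":
    base_rate = 5e-4   # assumed per-codon error rate
    length = 400       # assumed protein length in codons
    print(fraction_mistranslated(base_rate, length))
    # If every codon were translated 6 times more accurately (cf. the model above):
    print(fraction_mistranslated(base_rate / 6.0, length))
```

Under these assumed numbers roughly a fifth of proteins contain at least one error, and switching to more accurately translated codons cuts that fraction substantially. This is the lever that selection for translational accuracy would act on.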
Conclusions
- evol rate should be considered as regulatory as well as functional signal
- translational selection suffices to explain many evol patterns
- brute-force modelling of protein evol possible