Difference between revisions of "Drummond:PopGen"

(Continuous rate of change: logit)
(Continuous rate of change)
 
(7 intermediate revisions by the same user not shown)
Line 112: Line 112:
 
|<math>= s p(t)(1-p(t))\!</math>
 
|<math>= s p(t)(1-p(t))\!</math>
 
|}
 
|}
This result says that the proportion of type 1 <math>p</math> changes most rapidly when <math>p=0.5</math> and most slowly when <math>p</math> is very close to 0 or 1.  
+
This result says that the proportion of type 1, <math>p</math>, changes most rapidly when <math>p=0.5</math> and most slowly when <math>p</math> is very close to 0 or 1.
  
 
==Evolution is linear on a log-odds scale==
 
==Evolution is linear on a log-odds scale==
The logit function <math>\mathrm{logit} (p) = \ln {p \over 1-p}</math>, which takes <math>p \in [0,1] \to \mathbb{R}</math>, induces a more natural space for considering changes in frequencies.  Rather than tracking the proportion of type 1 or 2, we instead track their log odds.  In logit terms, with <math>L_p(t) \equiv \mathrm{logit} (p(t))</math>,
+
The logit function <math>\mathrm{logit} (p) = \ln {p \over 1-p}</math>, which takes <math>p \in [0,1] \to \mathbb{R}</math>, induces a more natural space for considering changes in frequencies.  Rather than tracking the proportion of type 1 or 2, we instead track their log odds.  In logit terms, with <math>L_p(t) \equiv \mathrm{logit} (p(t))\!</math>,
  
 
:{|
 
:{|
Line 125: Line 125:
 
|-
 
|-
 
|
 
|
|<math>= {\partial  \over \partial t}\left(\ln e^{st}\right)</math>
+
|<math>= {\partial  \over \partial t}\left(\ln {n_1(0) \over n_2(0)} e^{st}\right)</math>
 
|-
 
|-
 
|
 
|
Line 131: Line 131:
 
|}
 
|}
  
That is, <math>L_p(t)</math>, the log-odds of finding type 1 in a random draw from the population, changes linearly in time with slope <math>s</math>.  This differential equation has the solution
+
This differential equation <math>L_p'(t) = s</math> has the solution
  
<math>L_p(t) = L_p(0)e^{st}\!</math>
+
:<math>L_p(t) = L_p(0) + st\!</math>
  
showing that the log-odds of finding type 1 changes exponentially in time, increasing if <math>s>0</math> and decreasing if <math>s<0</math>.
+
showing that the log-odds of finding type 1 changes linearly in time, increasing if <math>s>0</math> and decreasing if <math>s<0</math>.
  
 
==Diffusion approximation==
 
==Diffusion approximation==
 
Insert math here.
 
Insert math here.
 +
 +
==Statistical analysis of relative growth rates==
 +
We have three strains, <math>i</math>, <math>j</math> and <math>r</math>, where <math>r</math> is a reference strain.
 +
Strains <math>i</math> and <math>j</math> have fitness <math>w_i = e^{r_i}</math> and <math>w_j=e^{r_j}</math>.  Define the selection coefficient <math>s_{ij} = \ln \frac{w_i}{w_j} = r_i - r_j</math> as usual.
 +
We have data consisting of triples (<math>g=</math>number of generations, <math>n_i=</math>number of cells of type <math>i</math>, <math>n_r=</math>number of cells of type <math>r</math>).
 +
We have data consisting of pairs (<math>g=</math>number of generations, <math>p_{ir}= n_i/n_r</math>) where <math>n_i</math>=number of cells of type <math>i</math> and <math>n_r=</math>number of cells of type <math>r</math>.
 +
 +
What is the best estimate, and error, on <math>s_{ij}</math>?
 +
 +
===Model===
 +
Assuming exponential growth, <math>\ln p_{ir} = </math>
 +
 +
Let <math>\Pr(s_{ij}=t) = \mathcal{N}(t;\mu_{ij}, \sigma^2_{ij})</math>.
 +
 +
===Maximum-likelihood approach===
 +
Add text.
 +
 +
===Bayesian approach===
 +
Add text.

Latest revision as of 19:40, 28 March 2011

Introduction

Here I will treat some basic questions in population genetics. For personal reasons, I tend to include all the algebra.

Per-generation and instantaneous growth rates

What is the relationship between per-generation growth rates and the Malthusian parameter, the instantaneous rate of growth?

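Let <math>n_i(t)</math> be the number of organisms of type <math>i</math> at time <math>t</math>, and let <math>R</math> be the per-capita reproductive rate per generation. If <math>t</math> counts generations, then

:<math>n_i(t+1) = n_i(t)R\!</math>

and

:<math>n_i(t) = n_i(0)R^t.\!</math>

Now we wish to move to the case where <math>t</math> is continuous and real-valued. As before,

:<math>n_i(t+\Delta t) = n_i(t)R^{\Delta t}\!</math>

but now

:<math>{dn_i(t) \over dt} = \lim_{\Delta t \to 0} {n_i(t+\Delta t) - n_i(t) \over \Delta t} = \lim_{\Delta t \to 0} {n_i(t)\left(R^{\Delta t} - 1\right) \over \Delta t} = n_i(t)\ln R</math>

where the last simplification follows from L'Hôpital's rule. Explicitly, let <math>\epsilon = \Delta t</math>. Then

:<math>\lim_{\epsilon \to 0} {R^{\epsilon} - 1 \over \epsilon} = \lim_{\epsilon \to 0} {R^{\epsilon}\ln R \over 1} = \ln R.</math>

The solution to the equation

:<math>{dn_i(t) \over dt} = n_i(t)\ln R</math>

is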
:<math>n_i(t) = n_i(0) e^{t\ln R} = n_i(0) R^{t}.\!</math>

Note that the continuous case and the original discrete-generation case agree for all integer values of <math>t</math>. We can define the instantaneous growth rate <math>r = \ln R</math> for convenience.
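
As a quick numerical check of the agreement between the discrete and continuous cases, the following is a minimal sketch. The function names and the values of <math>n_i(0)</math> and <math>R</math> are illustrative assumptions, not part of the original notes.

```python
import math

def discrete_growth(n0, R, generations):
    """Iterate n(t+1) = R * n(t) for a whole number of generations."""
    n = n0
    for _ in range(generations):
        n *= R
    return n

def continuous_growth(n0, R, t):
    """Evaluate n(t) = n(0) * exp(t * ln R) for any real t >= 0."""
    r = math.log(R)  # instantaneous growth rate r = ln R
    return n0 * math.exp(r * t)

# Illustrative (assumed) starting size and per-generation rate.
n0, R = 100.0, 2.0
for t in range(6):
    print(t, discrete_growth(n0, R, t), continuous_growth(n0, R, t))
```

The continuous form also gives values between generations, for example <code>continuous_growth(n0, R, 0.5)</code>, which the discrete recursion cannot.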

Continuous rate of change

If two organisms grow at different rates, how do their proportions in the population change over time?

Let <math>r_1</math> and <math>r_2</math> be the instantaneous rates of increase of type 1 and type 2, respectively. Then

:<math>{dn_i(t) \over dt} = r_i n_i(t).</math>

With the total population size

:<math>n(t) = n_1(t) + n_2(t)\!</math>

we have the proportion of type 1

:<math>p(t) = {n_1(t) \over n(t)}</math>

Define the fitness advantage

:<math>s \equiv s_{12} = r_1 - r_2\!</math>

Given our interest in understanding the change in gene frequencies, our goal is to compute the rate of change of <math>p(t)</math>.

:{|
|<math>{\partial p(t) \over \partial t}</math>
|<math>= {\partial \over \partial t}\left({n_1(t) \over n(t)}\right)</math>
|-
|
|<math>= {\partial n_1(t) \over \partial t}\left({1 \over n(t)}\right) + n_1(t){-1 \over n(t)^2}{\partial n(t) \over \partial t}</math>
|-
|
|<math>= {\partial n_1(t) \over \partial t}\left({1 \over n(t)}\right) + n_1(t){-1 \over n(t)^2}\left({\partial n_1(t) \over \partial t} + {\partial n_2(t) \over \partial t}\right)</math>
|-
|
|<math>= {r_1 n_1(t) \over n(t)} - {n_1(t) \over n(t)^2}\left(r_1 n_1(t) + r_2 n_2(t)\right)</math>
|-
|
|<math>= {r_1 n_1(t) \over n(t)} - {n_1(t) \over n(t)^2}\left(r_1 n_1(t) + (r_1-s)(n(t)-n_1(t))\right)</math>
|-
|
|<math>= {r_1 n_1(t) \over n(t)} - {n_1(t) \over n(t)^2}\left(r_1 n(t) - s n(t) + s n_1(t)\right)</math>
|-
|
|<math>= {n_1(t) \over n(t)^2}\left(s n(t) - s n_1(t)\right)</math>
|-
|
|<math>= s{n_1(t) \over n(t)}\left(1 - {n_1(t) \over n(t)}\right)</math>
|-
|
|<math>= s p(t)(1-p(t))\!</math>
|}

This result says that the proportion of type 1, <math>p</math>, changes most rapidly when <math>p=0.5</math> and most slowly when <math>p</math> is very close to 0 or 1.
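
The relation <math>\partial p / \partial t = s p (1-p)</math> can be checked numerically. The sketch below compares a finite-difference derivative of <math>p(t)</math>, computed from two exponentially growing types, against the closed form. The starting counts and growth rates are assumed for illustration only.

```python
import math

def proportion(t, n1_0, n2_0, r1, r2):
    """p(t) = n1(t) / (n1(t) + n2(t)) with n_i(t) = n_i(0) * exp(r_i * t)."""
    n1 = n1_0 * math.exp(r1 * t)
    n2 = n2_0 * math.exp(r2 * t)
    return n1 / (n1 + n2)

def dp_dt_numeric(t, n1_0, n2_0, r1, r2, h=1e-5):
    """Central finite-difference estimate of dp/dt."""
    ahead = proportion(t + h, n1_0, n2_0, r1, r2)
    behind = proportion(t - h, n1_0, n2_0, r1, r2)
    return (ahead - behind) / (2.0 * h)

def dp_dt_formula(t, n1_0, n2_0, r1, r2):
    """Closed form s * p * (1 - p) with s = r1 - r2."""
    p = proportion(t, n1_0, n2_0, r1, r2)
    return (r1 - r2) * p * (1.0 - p)

# Assumed illustrative values: type 1 starts rare with a growth advantage.
params = (10.0, 990.0, 0.7, 0.5)
for t in (0.0, 10.0, 20.0, 30.0, 40.0):
    print(t, proportion(t, *params), dp_dt_numeric(t, *params), dp_dt_formula(t, *params))
```

With these values the rate of change peaks near the time at which <math>p = 0.5</math>, where it equals <math>s/4</math>.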

Evolution is linear on a log-odds scale

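The logit function <math>\mathrm{logit} (p) = \ln {p \over 1-p}</math>, which takes <math>p \in (0,1) \to \mathbb{R}</math>, induces a more natural space for considering changes in frequencies. Rather than tracking the proportion of type 1 or 2, we instead track their log odds. In logit terms, with <math>L_p(t) \equiv \mathrm{logit} (p(t))\!</math>,

:{|
|<math>{\partial L_p(t) \over \partial t}</math>
|<math>= {\partial \over \partial t}\left(\ln {p(t) \over 1-p(t)}\right)</math>
|-
|
|<math>= {\partial \over \partial t}\left(\ln {n_1(t) \over n_2(t)}\right)</math>
|-
|
|<math>= {\partial \over \partial t}\left(\ln \left[{n_1(0) \over n_2(0)} e^{st}\right]\right)</math>
|-
|
|<math>= {\partial \over \partial t}\left(\ln {n_1(0) \over n_2(0)} + st\right)</math>
|-
|
|<math>= s</math>
|}

This differential equation <math>L_p'(t) = s</math> has the solution

:<math>L_p(t) = L_p(0) + st\!</math>

showing that the log-odds of finding type 1 changes linearly in time, increasing if <math>s>0</math> and decreasing if <math>s<0</math>.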
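
Because the log-odds follow <math>L_p(t) = L_p(0) + st</math>, the proportion at any time can be obtained by moving along a straight line in logit space and mapping back. The sketch below does this and also inverts the line to ask how long a given frequency change takes. The helper names and numeric values are assumptions made for illustration.

```python
import math

def logit(p):
    """Log-odds ln(p / (1 - p)) for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Inverse of logit: maps a real log-odds value back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))

def proportion_at(t, p0, s):
    """p(t) obtained from the linear log-odds solution L(t) = L(0) + s * t."""
    return inv_logit(logit(p0) + s * t)

def time_to_reach(p_target, p0, s):
    """Time needed for the log-odds to move from logit(p0) to logit(p_target)."""
    return (logit(p_target) - logit(p0)) / s

# Assumed illustrative values: type 1 starts at 1% with advantage s = 0.2.
p0, s = 0.01, 0.2
for t in (0, 10, 20, 30, 40):
    print(t, logit(proportion_at(t, p0, s)), proportion_at(t, p0, s))
print("time from 1% to 99%:", time_to_reach(0.99, p0, s))
```

The printed log-odds increase by the same amount, <math>10s</math>, between successive rows, while the proportion traces a sigmoid.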

Diffusion approximation

Insert math here.

Statistical analysis of relative growth rates

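We have three strains, <math>i</math>, <math>j</math> and <math>r</math>, where <math>r</math> is a reference strain. Strains <math>i</math> and <math>j</math> have fitnesses <math>w_i = e^{r_i}</math> and <math>w_j=e^{r_j}</math>. Define the selection coefficient <math>s_{ij} = \ln \frac{w_i}{w_j} = r_i - r_j</math> as usual. We have data consisting of triples (<math>g</math>, <math>n_i</math>, <math>n_r</math>), where <math>g=</math>number of generations, <math>n_i=</math>number of cells of type <math>i</math> and <math>n_r=</math>number of cells of type <math>r</math>. Equivalently, the data are pairs (<math>g</math>, <math>p_{ir}= n_i/n_r</math>).

What is the best estimate of <math>s_{ij}</math>, and what is its error?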

Model

Assuming exponential growth, <math>\ln p_{ir} = </math>

Let <math>\Pr(s_{ij}=t) = \mathcal{N}(t;\mu_{ij}, \sigma^2_{ij})</math>.
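
One simple way to approach the estimation question, offered here only as a sketch and not as the method these notes intend, is to assume that the log ratio of each strain to the reference changes linearly with the number of generations, with a slope equal to the selection coefficient relative to the reference. Under that assumption, <math>s_{ij}</math> is the difference of the two fitted slopes, since <math>s_{ij} = (r_i - r_r) - (r_j - r_r)</math>. The data below are synthetic, the function names are hypothetical, and the error propagation assumes the two fits are independent.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (slope, slope std. error)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    slope_se = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope_se

def synthetic_log_ratios(s, log_p0, generations, wiggle):
    """Synthetic (not measured) data: an exact line plus a small alternating offset."""
    return [log_p0 + s * g + wiggle * (-1) ** g for g in generations]

generations = list(range(10))
true_s_ir, true_s_jr = 0.10, -0.05  # assumed values used only to generate the data
log_p_ir = synthetic_log_ratios(true_s_ir, 0.0, generations, 0.05)
log_p_jr = synthetic_log_ratios(true_s_jr, 0.0, generations, -0.05)

s_ir_hat, se_ir = fit_line(generations, log_p_ir)
s_jr_hat, se_jr = fit_line(generations, log_p_jr)

# s_ij = s_ir - s_jr; errors added in quadrature, assuming independent fits.
s_ij_hat = s_ir_hat - s_jr_hat
se_ij = math.sqrt(se_ir ** 2 + se_jr ** 2)
print("estimated s_ij:", s_ij_hat, "+/-", se_ij)
```

Least squares corresponds to maximum likelihood only if the noise on the log ratios is Gaussian with constant variance, which is a further assumption that real count data may not satisfy.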

Maximum-likelihood approach

Add text.

Bayesian approach

Add text.