Drummond:PopGen
Introduction
Here I will treat some basic questions in population genetics. For personal reasons, I tend to include all the algebra.
Per-generation and instantaneous growth rates
What is the relationship between per-generation growth rates and the Malthusian parameter, the instantaneous rate of growth?
Let [math]\displaystyle{ n_i(t) }[/math] be the number of organisms of type [math]\displaystyle{ i }[/math] at time [math]\displaystyle{ t }[/math], and let [math]\displaystyle{ R }[/math] be the per-capita reproductive rate per generation. If [math]\displaystyle{ t }[/math] counts generations, then
- [math]\displaystyle{ n_i(t+1) = n_i(t)R\! }[/math]
- [math]\displaystyle{ n_i(t) = n_i(0)R^t.\! }[/math]
Now we wish to move to the case where [math]\displaystyle{ t }[/math] is continuous and real-valued.
As before,
- [math]\displaystyle{ n_i(t+1) = n_i(t)R\! }[/math]
- [math]\displaystyle{ n_i(t+\Delta t) = n_i(t)R^{\Delta t}\! }[/math]
- [math]\displaystyle{ n_i(t+\Delta t) - n_i(t) = n_i(t)R^{\Delta t} - n_i(t)\! }[/math]
- [math]\displaystyle{ \frac{n_i(t+\Delta t) - n_i(t)}{\Delta t} = \frac{n_i(t)R^{\Delta t} - n_i(t)}{\Delta t} }[/math]
- [math]\displaystyle{ \frac{n_i(t+\Delta t) - n_i(t)}{\Delta t} = n_i(t) \frac{R^{\Delta t} - 1}{\Delta t} }[/math]
- [math]\displaystyle{ \lim_{\Delta t \to 0} \left[{n_i(t+\Delta t) - n_i(t) \over \Delta t}\right] = \lim_{\Delta t \to 0} \left[ n_i(t) \frac{R^{\Delta t} - 1}{\Delta t}\right] }[/math]
- [math]\displaystyle{ \frac{d n_i(t)}{dt} = n_i(t) \lim_{\Delta t \to 0} \left[\frac{R^{\Delta t} - 1}{\Delta t}\right] }[/math]
- [math]\displaystyle{ \frac{d n_i(t)}{dt} = n_i(t) \ln R\! }[/math]
where the last simplification follows from L'Hôpital's rule. Explicitly, let [math]\displaystyle{ \epsilon=\Delta t }[/math]. Then
- [math]\displaystyle{ \lim_{\Delta t \to 0} \left[{R^{\Delta t} - 1 \over \Delta t}\right] = \lim_{\epsilon \to 0} \left[\frac{R^{\epsilon} - 1}{\epsilon}\right] }[/math]
- [math]\displaystyle{ {} = \lim_{\epsilon \to 0} \left[\frac{\frac{d}{d\epsilon}\left(R^{\epsilon} - 1\right)}{\frac{d}{d\epsilon}\epsilon}\right] }[/math]
- [math]\displaystyle{ {} = \lim_{\epsilon \to 0} \left[\frac{R^{\epsilon}\ln R}{1}\right] }[/math]
- [math]\displaystyle{ {} = \ln R \lim_{\epsilon \to 0} \left[R^{\epsilon}\right] }[/math]
- [math]\displaystyle{ {} = \ln R\! }[/math]
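The limit can also be checked numerically. The following is a minimal sketch; the function name `difference_quotient` and the value of `R` are illustrative choices, not part of the derivation.

```python
import math

def difference_quotient(R, eps):
    """Return (R**eps - 1) / eps, the quantity whose eps -> 0 limit is ln R."""
    return (R ** eps - 1.0) / eps

R = 2.0  # illustrative per-generation reproductive rate
for eps in (1e-1, 1e-3, 1e-6):
    # The quotient should approach math.log(R) as eps shrinks.
    print(eps, difference_quotient(R, eps), math.log(R))
```

The gap between the quotient and [math]\displaystyle{ \ln R }[/math] shrinks roughly in proportion to [math]\displaystyle{ \epsilon }[/math], as expected from the next term of the Taylor expansion of [math]\displaystyle{ R^{\epsilon} }[/math].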
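The solution to the equation
- [math]\displaystyle{ \frac{d n_i(t)}{dt} = n_i(t) \ln R }[/math]
is
- [math]\displaystyle{ n_i(t) = n_i(0) e^{t\ln R} = n_i(0) R^{t}.\! }[/math]
The continuous-time model therefore reproduces the per-generation result, and the instantaneous rate of growth (the Malthusian parameter) is [math]\displaystyle{ \ln R }[/math].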
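As a sanity check, the per-generation recursion and the continuous-time solution can be compared directly. This is a sketch only; `discrete_growth` and `continuous_growth` are hypothetical helper names, and the numbers are arbitrary.

```python
import math

def discrete_growth(n0, R, generations):
    """Apply n(t+1) = n(t) * R for a whole number of generations."""
    n = n0
    for _ in range(generations):
        n *= R
    return n

def continuous_growth(n0, R, t):
    """Evaluate n(t) = n(0) * exp(t * ln R) for any real t >= 0."""
    return n0 * math.exp(t * math.log(R))

# At integer times the two agree up to floating-point error.
print(discrete_growth(100.0, 1.5, 10), continuous_growth(100.0, 1.5, 10.0))
# The continuous form is also defined between generations.
print(continuous_growth(100.0, 4.0, 0.5))
```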
Continuous rate of change
If two organisms grow at different rates, how do their proportions in the population change over time?
Let [math]\displaystyle{ r_1 }[/math] and [math]\displaystyle{ r_2 }[/math] be the instantaneous rates of increase of type 1 and type 2, respectively. Then
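- [math]\displaystyle{ {dn_i(t) \over dt} = r_i n_i(t). }[/math]
Define the total population size, the proportion of type 1, and the difference in growth rates (the selection coefficient) as
- [math]\displaystyle{ n(t) = n_1(t) + n_2(t)\! }[/math]
- [math]\displaystyle{ p(t) = {n_1(t) \over n(t)} }[/math]
- [math]\displaystyle{ s \equiv s_{12} = r_1 - r_2\! }[/math]
The proportion of type 1 then changes as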
- [math]\displaystyle{ {\partial p(t) \over \partial t} = {\partial \over \partial t}\left({n_1(t) \over n(t)}\right) }[/math]
- [math]\displaystyle{ {} = {\partial n_1(t) \over \partial t}\left({1 \over n(t)}\right) + n_1(t){-1 \over n(t)^2}{\partial n(t) \over \partial t} }[/math]
- [math]\displaystyle{ {} = {\partial n_1(t) \over \partial t}\left({1 \over n(t)}\right) + n_1(t){-1 \over n(t)^2}\left({\partial n_1(t) \over \partial t} + {\partial n_2(t) \over \partial t}\right) }[/math]
- [math]\displaystyle{ {} = {r_1 n_1(t) \over n(t)} - {n_1(t) \over n(t)^2}\left(r_1 n_1(t) + r_2 n_2(t)\right) }[/math]
- [math]\displaystyle{ {} = {r_1 n_1(t) \over n(t)} - {n_1(t) \over n(t)^2}\left(r_1 n_1(t) + (r_1-s)(n(t)-n_1(t))\right) }[/math]
- [math]\displaystyle{ {} = {r_1 n_1(t) \over n(t)} - {n_1(t) \over n(t)^2}\left(r_1 n(t) - s n(t) + s n_1(t)\right) }[/math]
- [math]\displaystyle{ {} = {n_1(t) \over n(t)^2}\left(s n(t) - s n_1(t)\right) }[/math]
- [math]\displaystyle{ {} = s{n_1(t) \over n(t)}\left(1 - {n_1(t) \over n(t)}\right) }[/math]
- [math]\displaystyle{ {} = s p(t)(1-p(t))\! }[/math]
This result says that the proportion of type 1, [math]\displaystyle{ p }[/math], changes most rapidly when [math]\displaystyle{ p=0.5 }[/math] and most slowly when [math]\displaystyle{ p }[/math] is very close to 0 or 1.
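The result can be checked numerically by integrating [math]\displaystyle{ dp/dt = s p (1-p) }[/math] and comparing against the proportion computed directly from two exponentially growing populations. The sketch below uses a classical fourth-order Runge-Kutta step; the function names, step count and parameter values are assumptions made for illustration.

```python
import math

def proportion_from_counts(n1_0, n2_0, r1, r2, t):
    """p(t) = n1(t) / (n1(t) + n2(t)) with n_i(t) = n_i(0) * exp(r_i * t)."""
    n1 = n1_0 * math.exp(r1 * t)
    n2 = n2_0 * math.exp(r2 * t)
    return n1 / (n1 + n2)

def proportion_from_ode(p0, s, t, steps=10000):
    """Integrate dp/dt = s * p * (1 - p) from 0 to t with RK4."""
    def f(p):
        return s * p * (1.0 - p)
    h = t / steps
    p = p0
    for _ in range(steps):
        k1 = f(p)
        k2 = f(p + 0.5 * h * k1)
        k3 = f(p + 0.5 * h * k2)
        k4 = f(p + h * k3)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return p

# Illustrative values: type 1 starts at 1% and has s = r1 - r2 = 0.2.
r1, r2 = 0.7, 0.5
direct = proportion_from_counts(10.0, 990.0, r1, r2, 20.0)
integrated = proportion_from_ode(0.01, r1 - r2, 20.0)
print(direct, integrated)
```

Only the difference [math]\displaystyle{ s = r_1 - r_2 }[/math] enters the integrated version, which is the point of the derivation: the absolute growth rates drop out.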
Evolution is linear on a log-odds scale
The logit function [math]\displaystyle{ \mathrm{logit} (p) = \ln {p \over 1-p} }[/math], which maps [math]\displaystyle{ p \in (0,1) }[/math] onto [math]\displaystyle{ \mathbb{R} }[/math], induces a more natural space for considering changes in frequencies. Rather than tracking the proportion of type 1 or 2, we instead track their log odds. In logit terms, with [math]\displaystyle{ L_p(t) \equiv \mathrm{logit} (p(t))\! }[/math],
- [math]\displaystyle{ {\partial L_p(t) \over \partial t} = {\partial \over \partial t}\left(\ln {p(t) \over 1-p(t)}\right) }[/math]
- [math]\displaystyle{ {} = {\partial \over \partial t}\left(\ln {n_1(t) \over n_2(t)}\right) }[/math]
- [math]\displaystyle{ {} = {\partial \over \partial t}\left(\ln \left[{n_1(0) \over n_2(0)} e^{st}\right]\right) }[/math]
- [math]\displaystyle{ {} = s. \! }[/math]
This differential equation [math]\displaystyle{ L_p'(t) = s }[/math] has the solution
- [math]\displaystyle{ L_p(t) = L_p(0) + st\! }[/math]
showing that the log-odds of finding type 1 changes linearly in time, increasing if [math]\displaystyle{ s\gt 0 }[/math] and decreasing if [math]\displaystyle{ s\lt 0 }[/math].
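Because the log-odds are linear in time, the frequency at any time, and the time needed to move between two frequencies, follow from simple arithmetic. The sketch below illustrates this; the helper names (`logit`, `inv_logit`, `frequency_at`, `time_to_reach`) and the example numbers are assumptions for illustration.

```python
import math

def logit(p):
    """Log-odds of a proportion p in (0, 1)."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Inverse of logit: map log-odds back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))

def frequency_at(p0, s, t):
    """Proportion of type 1 at time t, using L_p(t) = L_p(0) + s*t."""
    return inv_logit(logit(p0) + s * t)

def time_to_reach(p0, p1, s):
    """Time for the proportion to move from p0 to p1 (s must be nonzero)."""
    return (logit(p1) - logit(p0)) / s

# Illustrative case: how long does a type with s = 0.2 take to go from 1% to 99%?
t = time_to_reach(0.01, 0.99, 0.2)
print(t, frequency_at(0.01, 0.2, t))
```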
Diffusion approximation
Insert math here.
Statistical analysis of relative growth rates
We have three strains, [math]\displaystyle{ i }[/math], [math]\displaystyle{ j }[/math] and [math]\displaystyle{ r }[/math], where [math]\displaystyle{ r }[/math] is a reference strain. Strains [math]\displaystyle{ i }[/math] and [math]\displaystyle{ j }[/math] have fitness [math]\displaystyle{ w_i = e^{r_i} }[/math] and [math]\displaystyle{ w_j=e^{r_j} }[/math]. Define the selection coefficient [math]\displaystyle{ s_{ij} = \ln \frac{w_i}{w_j} = r_i - r_j }[/math] as usual. We have data consisting of triples ([math]\displaystyle{ g= }[/math]number of generations, [math]\displaystyle{ n_i= }[/math]number of cells of type [math]\displaystyle{ i }[/math], [math]\displaystyle{ n_r= }[/math]number of cells of type [math]\displaystyle{ r }[/math]), each of which reduces to a pair ([math]\displaystyle{ g }[/math], [math]\displaystyle{ p_{ir}= n_i/n_r }[/math]).
What is the best estimate, and error, on [math]\displaystyle{ s_{ij} }[/math]?
Model
Assuming exponential growth, [math]\displaystyle{ \ln p_{ir} = }[/math]
Let [math]\displaystyle{ \Pr(s_{ij}=t) = \mathcal{N}(t;\mu_{ij}, \sigma^2_{ij}) }[/math].
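One simple way to get a point estimate and standard error is sketched below. It is not necessarily the analysis intended here, and it rests on assumptions not stated above: that [math]\displaystyle{ \ln p_{ir} }[/math] is linear in [math]\displaystyle{ g }[/math] with slope [math]\displaystyle{ s_{ir} }[/math] (by analogy with the log-odds result), that noise on the log ratio is independent and Gaussian with constant variance, and that strains [math]\displaystyle{ i }[/math] and [math]\displaystyle{ j }[/math] are competed against the reference in independent experiments, so that [math]\displaystyle{ s_{ij} = s_{ir} - s_{jr} }[/math] with variances adding. Under the Gaussian assumption the least-squares slope is also the maximum-likelihood slope. All function names are hypothetical.

```python
import math

def fit_log_ratio_slope(generations, ratios):
    """Least-squares fit of ln(ratio) = a + s*g.

    Returns (s_hat, standard_error). Needs at least three data points.
    Assumes independent Gaussian noise of constant variance on ln(ratio).
    """
    y = [math.log(p) for p in ratios]
    n = len(generations)
    g_mean = sum(generations) / n
    y_mean = sum(y) / n
    sxx = sum((g - g_mean) ** 2 for g in generations)
    sxy = sum((g - g_mean) * (yi - y_mean) for g, yi in zip(generations, y))
    slope = sxy / sxx
    intercept = y_mean - slope * g_mean
    rss = sum((yi - intercept - slope * g) ** 2 for g, yi in zip(generations, y))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope, std_err

def selection_coefficient(s_ir, se_ir, s_jr, se_jr):
    """Combine two reference-based estimates: s_ij = s_ir - s_jr.

    Assumes the two estimates are independent, so their variances add.
    """
    return s_ir - s_jr, math.sqrt(se_ir ** 2 + se_jr ** 2)
```

With noiseless synthetic data generated as `0.5 * exp(0.1 * g)`, the fitted slope recovers 0.1 and the standard error is essentially zero; real data would need replicate measurements for the error estimate to be meaningful.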
Maximum-likelihood approach
Add text.
Bayesian approach
Add text.