We've moved to http://drummondlab.org.

the drummond lab

Introduction

Here I will treat some basic questions in population genetics. For personal reasons, I tend to include all the algebra.

Per-generation and instantaneous growth rates

What is the relationship between per-generation growth rates and the Malthusian parameter, the instantaneous rate of growth?

Let $\displaystyle n_i(t)$ be the number of organisms of type $\displaystyle i$ at time $\displaystyle t$ , and let $\displaystyle R$ be the per-capita reproductive rate per generation. If $\displaystyle t$ counts generations, then

$\displaystyle n_i(t+1) = n_i(t)R\!$
and
$\displaystyle n_i(t) = n_i(0)R^t.\!$

Now we wish to move to the case where $\displaystyle t$ is continuous and real-valued. As before,

$\displaystyle n_i(t+1) = n_i(t)R\!$
but now
$\displaystyle \begin{aligned}
n_i(t+\Delta t) &= n_i(t)R^{\Delta t} \\
n_i(t+\Delta t) - n_i(t) &= n_i(t)R^{\Delta t} - n_i(t) \\
\frac{n_i(t+\Delta t) - n_i(t)}{\Delta t} &= \frac{n_i(t)R^{\Delta t} - n_i(t)}{\Delta t} \\
\frac{n_i(t+\Delta t) - n_i(t)}{\Delta t} &= n_i(t) \frac{R^{\Delta t} - 1}{\Delta t} \\
\lim_{\Delta t \to 0} \left[\frac{n_i(t+\Delta t) - n_i(t)}{\Delta t}\right] &= \lim_{\Delta t \to 0} \left[ n_i(t) \frac{R^{\Delta t} - 1}{\Delta t}\right] \\
\frac{d n_i(t)}{dt} &= n_i(t) \lim_{\Delta t \to 0} \left[\frac{R^{\Delta t} - 1}{\Delta t}\right] \\
\frac{d n_i(t)}{dt} &= n_i(t) \ln R
\end{aligned}$

where the last simplification follows from L'Hôpital's rule. Explicitly, let $\displaystyle \epsilon=\Delta t$ . Then

$\displaystyle \begin{aligned}
\lim_{\Delta t \to 0} \left[\frac{R^{\Delta t} - 1}{\Delta t}\right] &= \lim_{\epsilon \to 0} \left[\frac{R^{\epsilon} - 1}{\epsilon}\right] \\
&= \lim_{\epsilon \to 0} \left[\frac{\frac{d}{d\epsilon}\left(R^{\epsilon} - 1\right)}{\frac{d}{d\epsilon}\epsilon}\right] \\
&= \lim_{\epsilon \to 0} \left[\frac{R^{\epsilon}\ln R}{1}\right] \\
&= \ln R \lim_{\epsilon \to 0} \left[R^{\epsilon}\right] \\
&= \ln R
\end{aligned}$

The solution to the equation

$\displaystyle \frac{d n_i(t)}{dt} = n_i(t) \ln R$
is
$\displaystyle n_i(t) = n_i(0) e^{t\ln R} = n_i(0) R^{t}.\!$
Note that the continuous case and the original discrete-generation case agree for all integer values of $\displaystyle t$ . We can define the instantaneous growth rate $\displaystyle r = \ln R$ for convenience.
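
As a quick numerical sanity check of the two claims above, the following sketch evaluates the finite-difference quotient for shrinking $\displaystyle \Delta t$ and compares the discrete and continuous solutions at integer times. The values $\displaystyle R = 2$ and $\displaystyle n_i(0) = 100$ and the function names are arbitrary choices made for illustration, not taken from the text.

```python
import math

def finite_difference_rate(R, dt):
    """Return (R**dt - 1) / dt, which should approach ln R as dt -> 0."""
    return (R ** dt - 1.0) / dt

def continuous_size(n0, R, t):
    """Population size from the continuous solution n(0) * exp(t * ln R)."""
    return n0 * math.exp(t * math.log(R))

# Assumed example values: the population doubles every generation.
R = 2.0
n0 = 100.0

# The quotient converges to ln R as the time step shrinks.
for dt in (1.0, 0.1, 0.001, 1e-6):
    print(f"dt={dt:g}  quotient={finite_difference_rate(R, dt):.8f}  ln R={math.log(R):.8f}")

# The discrete recursion and the continuous solution agree at integer t.
for t in range(5):
    print(f"t={t}  discrete={n0 * R ** t:.6f}  continuous={continuous_size(n0, R, t):.6f}")
```

Note that at $\displaystyle \Delta t = 1$ the quotient is just $\displaystyle R - 1$, so the difference between $\displaystyle R - 1$ and $\displaystyle \ln R$ is what separates the per-generation and instantaneous rates.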

Continuous rate of change

If two organisms grow at different rates, how do their proportions in the population change over time?

Let $\displaystyle r_1$ and $\displaystyle r_2$ be the instantaneous rates of increase of type 1 and type 2, respectively. Then

$\displaystyle {dn_i(t) \over dt} = r_i n_i(t).$
With the total population size
$\displaystyle n(t) = n_1(t) + n_2(t)\!$
we have the proportion of type 1
$\displaystyle p(t) = {n_1(t) \over n(t)}$
and we define the selection coefficient as the difference in growth rates,
$\displaystyle s \equiv s_{12} = r_1 - r_2.\!$
Given our interest in understanding the change in gene frequencies, our goal is to compute the rate of change of $\displaystyle p(t)$ .
$\displaystyle \begin{aligned}
{\partial p(t) \over \partial t} &= {\partial \over \partial t}\left({n_1(t) \over n(t)}\right) \\
&= {\partial n_1(t) \over \partial t}\left({1 \over n(t)}\right) + n_1(t){-1 \over n(t)^2}{\partial n(t) \over \partial t} \\
&= {\partial n_1(t) \over \partial t}\left({1 \over n(t)}\right) + n_1(t){-1 \over n(t)^2}\left({\partial n_1(t) \over \partial t} + {\partial n_2(t) \over \partial t}\right) \\
&= {r_1 n_1(t) \over n(t)} - {n_1(t) \over n(t)^2}\left(r_1 n_1(t) + r_2 n_2(t)\right) \\
&= {r_1 n_1(t) \over n(t)} - {n_1(t) \over n(t)^2}\left(r_1 n_1(t) + (r_1-s)(n(t)-n_1(t))\right) \\
&= {r_1 n_1(t) \over n(t)} - {n_1(t) \over n(t)^2}\left(r_1 n(t) - s n(t) + s n_1(t)\right) \\
&= {n_1(t) \over n(t)^2}\left(s n(t) - s n_1(t)\right) \\
&= s{n_1(t) \over n(t)}\left(1 - {n_1(t) \over n(t)}\right) \\
&= s p(t)(1-p(t))
\end{aligned}$

This result says that the proportion of type 1, $\displaystyle p$ , changes most rapidly when $\displaystyle p=0.5$ and most slowly when $\displaystyle p$ is very close to 0 or 1.
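
The result can be checked numerically. The sketch below computes $\displaystyle p(t)$ directly from exponentially growing populations, takes a centered finite difference in time, and compares it with $\displaystyle s p(t)(1-p(t))$. The growth rates, starting counts, and function names are assumptions chosen only for illustration.

```python
import math

def proportion(t, n1_0, n2_0, r1, r2):
    """Proportion of type 1 at time t when each type grows exponentially."""
    n1 = n1_0 * math.exp(r1 * t)
    n2 = n2_0 * math.exp(r2 * t)
    return n1 / (n1 + n2)

def dp_dt_numeric(t, n1_0, n2_0, r1, r2, h=1e-5):
    """Centered finite-difference estimate of dp/dt at time t."""
    ahead = proportion(t + h, n1_0, n2_0, r1, r2)
    behind = proportion(t - h, n1_0, n2_0, r1, r2)
    return (ahead - behind) / (2.0 * h)

# Assumed example: type 1 starts at 1% and grows slightly faster than type 2.
n1_0, n2_0 = 1.0, 99.0
r1, r2 = 0.75, 0.70
s = r1 - r2

for t in (0.0, 50.0, 100.0, 150.0):
    p = proportion(t, n1_0, n2_0, r1, r2)
    numeric = dp_dt_numeric(t, n1_0, n2_0, r1, r2)
    predicted = s * p * (1.0 - p)
    print(f"t={t:6.1f}  p={p:.4f}  numeric={numeric:.6e}  s*p*(1-p)={predicted:.6e}")
```

The printed rates are largest near $\displaystyle p = 0.5$ and shrink toward zero at either extreme, matching the statement above.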

Evolution is linear on a log-odds scale

The logit function $\displaystyle \mathrm{logit} (p) = \ln {p \over 1-p}$, which maps $\displaystyle p \in (0,1)$ to $\displaystyle \mathbb{R}$, induces a more natural space for considering changes in frequencies. (At the endpoints $\displaystyle p=0$ and $\displaystyle p=1$ the logit is infinite.) Rather than tracking the proportion of type 1 or 2, we instead track their log odds. In logit terms, with $\displaystyle L_p(t) \equiv \mathrm{logit} (p(t))\!$,

$\displaystyle \begin{aligned}
{\partial L_p(t) \over \partial t} &= {\partial \over \partial t}\left(\ln {p(t) \over 1-p(t)}\right) \\
&= {\partial \over \partial t}\left(\ln {n_1(t) \over n_2(t)}\right) \\
&= {\partial \over \partial t}\left(\ln \left[{n_1(0) \over n_2(0)} e^{st}\right]\right) \\
&= s.
\end{aligned}$

This differential equation $\displaystyle L_p'(t) = s$ has the solution

$\displaystyle L_p(t) = L_p(0) + st\!$

showing that the log-odds of finding type 1 changes linearly in time, increasing if $\displaystyle s>0$ and decreasing if $\displaystyle s<0$ .
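
Because the log-odds move at constant speed $\displaystyle s$, the time needed to go from one frequency to another is just the logit difference divided by $\displaystyle s$. The sketch below illustrates this; the helper names and the example values ($\displaystyle s = 0.05$, frequencies 0.01 and 0.99) are assumptions for illustration only.

```python
import math

def logit(p):
    """Log-odds of a proportion p strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Inverse of logit: map log-odds back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))

def frequency_at(t, p0, s):
    """Proportion of type 1 at time t, using L_p(t) = L_p(0) + s*t."""
    return inv_logit(logit(p0) + s * t)

def time_between(p_start, p_end, s):
    """Time for the proportion to move from p_start to p_end at rate s."""
    return (logit(p_end) - logit(p_start)) / s

# Assumed example: a type with s = 0.05 rising from 1% to 99%.
s = 0.05
t_sweep = time_between(0.01, 0.99, s)
print(f"time from p=0.01 to p=0.99: {t_sweep:.2f} time units")
print(f"check: p at that time = {frequency_at(t_sweep, 0.01, s):.4f}")
```

The time unit here is whatever unit $\displaystyle s$ is measured in, and the sweep time scales as $\displaystyle 1/s$.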

Diffusion approximation

Insert math here.

Statistical analysis of relative growth rates

We have three strains, $\displaystyle i$, $\displaystyle j$ and $\displaystyle r$, where $\displaystyle r$ is a reference strain. (A subscripted $\displaystyle r_i$ still denotes a growth rate; the bare $\displaystyle r$ labels the reference strain.) Strains $\displaystyle i$ and $\displaystyle j$ have fitness $\displaystyle w_i = e^{r_i}$ and $\displaystyle w_j=e^{r_j}$. Define the selection coefficient $\displaystyle s_{ij} = \ln \frac{w_i}{w_j} = r_i - r_j$ as usual. We have data consisting of triples ($\displaystyle g=$ number of generations, $\displaystyle n_i=$ number of cells of type $\displaystyle i$, $\displaystyle n_r=$ number of cells of type $\displaystyle r$), which reduce to pairs ($\displaystyle g$, $\displaystyle p_{ir}= n_i/n_r$).

What is the best estimate of $\displaystyle s_{ij}$, and what is the error on that estimate?

Model

Assuming exponential growth, $\displaystyle \ln p_{ir} =$

Let $\displaystyle \Pr(s_{ij}=t) = \mathcal{N}(t;\mu_{ij}, \sigma^2_{ij})$ .
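
One simple way to get a point estimate and an error from data of this kind is sketched below. It is not necessarily the model intended here, and it rests on assumptions beyond the text: that $\displaystyle \ln p_{ir}$ is linear in $\displaystyle g$ with slope $\displaystyle s_{ir}$ (from the log-odds result above), that an ordinary least-squares fit is adequate, and that the estimates for strains $\displaystyle i$ and $\displaystyle j$ are independent, so that $\displaystyle s_{ij} = s_{ir} - s_{jr}$ with errors added in quadrature. All function names and the counts are hypothetical, made up for illustration.

```python
import math

def fit_slope(generations, log_ratios):
    """Least-squares slope of log_ratios on generations, with its standard error."""
    n = len(generations)
    if n < 3:
        raise ValueError("need at least three time points to estimate an error")
    g_mean = sum(generations) / n
    y_mean = sum(log_ratios) / n
    sxx = sum((g - g_mean) ** 2 for g in generations)
    sxy = sum((g - g_mean) * (y - y_mean) for g, y in zip(generations, log_ratios))
    slope = sxy / sxx
    intercept = y_mean - slope * g_mean
    # Residual sum of squares around the fitted line.
    ssr = sum((y - intercept - slope * g) ** 2 for g, y in zip(generations, log_ratios))
    std_err = math.sqrt(ssr / (n - 2) / sxx)
    return slope, std_err

def selection_vs_reference(generations, n_strain, n_ref):
    """Estimate the selection coefficient of a strain relative to the reference."""
    log_ratios = [math.log(a / b) for a, b in zip(n_strain, n_ref)]
    return fit_slope(generations, log_ratios)

def selection_between(est_i, est_j):
    """Combine two (estimate, std_err) pairs, assuming independent errors."""
    (s_ir, se_ir), (s_jr, se_jr) = est_i, est_j
    return s_ir - s_jr, math.hypot(se_ir, se_jr)

# Synthetic counts, invented purely to exercise the functions.
generations = [0, 5, 10, 15, 20]
n_ref = [1000, 1000, 1000, 1000, 1000]
n_i = [1000, 1310, 1620, 2150, 2690]
n_j = [1000, 1090, 1240, 1330, 1510]

est_i = selection_vs_reference(generations, n_i, n_ref)
est_j = selection_vs_reference(generations, n_j, n_ref)
s_ij, se_ij = selection_between(est_i, est_j)
print(f"s_ir = {est_i[0]:.4f} +/- {est_i[1]:.4f}")
print(f"s_jr = {est_j[0]:.4f} +/- {est_j[1]:.4f}")
print(f"s_ij = {s_ij:.4f} +/- {se_ij:.4f} per generation")
```

An estimate and standard error of this form could serve as the $\displaystyle \mu_{ij}$ and $\displaystyle \sigma_{ij}$ of the normal distribution above, though that identification is itself an assumption.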