# We've moved to http://drummondlab.org.

the drummond lab

## Introduction

Here I will treat some basic questions in population genetics. For personal reasons, I tend to include all the algebra.

## Per-generation and instantaneous growth rates

What is the relationship between per-generation growth rates and the Malthusian parameter, the instantaneous rate of growth?

Let ${\displaystyle n_{i}(t)}$ be the number of organisms of type ${\displaystyle i}$ at time ${\displaystyle t}$, and let ${\displaystyle R}$ be the per-capita reproductive rate per generation. If ${\displaystyle t}$ counts generations, then

${\displaystyle n_{i}(t+1)=n_{i}(t)R\!}$
and
${\displaystyle n_{i}(t)=n_{i}(0)R^{t}.\!}$

Now we wish to move to the case where ${\displaystyle t}$ is continuous and real-valued. As before,

${\displaystyle n_{i}(t+1)=n_{i}(t)R\!}$
but now
${\displaystyle {\begin{aligned}
n_{i}(t+\Delta t)&=n_{i}(t)R^{\Delta t}\\
n_{i}(t+\Delta t)-n_{i}(t)&=n_{i}(t)R^{\Delta t}-n_{i}(t)\\
{\frac {n_{i}(t+\Delta t)-n_{i}(t)}{\Delta t}}&={\frac {n_{i}(t)R^{\Delta t}-n_{i}(t)}{\Delta t}}\\
&=n_{i}(t){\frac {R^{\Delta t}-1}{\Delta t}}\\
\lim _{\Delta t\to 0}\left[{\frac {n_{i}(t+\Delta t)-n_{i}(t)}{\Delta t}}\right]&=\lim _{\Delta t\to 0}\left[n_{i}(t){\frac {R^{\Delta t}-1}{\Delta t}}\right]\\
{\frac {dn_{i}(t)}{dt}}&=n_{i}(t)\lim _{\Delta t\to 0}\left[{\frac {R^{\Delta t}-1}{\Delta t}}\right]\\
&=n_{i}(t)\ln R
\end{aligned}}}$

where the last simplification follows from L'Hôpital's rule. Explicitly, let ${\displaystyle \epsilon =\Delta t}$. Then

${\displaystyle {\begin{aligned}
\lim _{\Delta t\to 0}\left[{\frac {R^{\Delta t}-1}{\Delta t}}\right]&=\lim _{\epsilon \to 0}\left[{\frac {R^{\epsilon }-1}{\epsilon }}\right]\\
&=\lim _{\epsilon \to 0}\left[{\frac {{\frac {d}{d\epsilon }}\left(R^{\epsilon }-1\right)}{{\frac {d}{d\epsilon }}\epsilon }}\right]\\
&=\lim _{\epsilon \to 0}\left[{\frac {R^{\epsilon }\ln R}{1}}\right]\\
&=\ln R\lim _{\epsilon \to 0}\left[R^{\epsilon }\right]\\
&=\ln R
\end{aligned}}}$

The solution to the equation

${\displaystyle {\frac {dn_{i}(t)}{dt}}=n_{i}(t)\ln R}$
is
${\displaystyle n_{i}(t)=n_{i}(0)e^{t\ln R}=n_{i}(0)R^{t}.\!}$
Note that the continuous case and the original discrete-generation case agree for all integer values of ${\displaystyle t}$. We can define the instantaneous growth rate ${\displaystyle r=\ln R}$ for convenience.
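
As a quick numerical check of the limit and of the agreement between the two pictures, here is a minimal Python sketch. The values of `R` and `n0` are arbitrary assumptions chosen for illustration, and `difference_quotient` is a helper name introduced here.

```python
import math

R = 2.0           # assumed per-generation reproductive rate (one doubling per generation)
r = math.log(R)   # instantaneous growth rate r = ln R
n0 = 100.0        # assumed initial population size


def difference_quotient(R, dt):
    """Return (R**dt - 1) / dt, which should approach ln R as dt -> 0."""
    return (R ** dt - 1.0) / dt


# The difference quotient converges to ln R as the time step shrinks.
for dt in (1.0, 0.1, 0.001, 1e-6):
    print(f"dt={dt:g}  quotient={difference_quotient(R, dt):.6f}  ln R={r:.6f}")

# Discrete-generation and continuous-time growth agree at integer times.
for t in range(6):
    discrete = n0 * R ** t
    continuous = n0 * math.exp(r * t)
    print(f"t={t}  discrete={discrete:.3f}  continuous={continuous:.3f}")
```

The quotient at `dt=1.0` is just `R - 1`, the per-generation fractional increase, which makes the difference between the per-generation and instantaneous rates concrete.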

## Continuous rate of change

If two organisms grow at different rates, how do their proportions in the population change over time?

Let ${\displaystyle r_{1}}$ and ${\displaystyle r_{2}}$ be the instantaneous rates of increase of type 1 and type 2, respectively. Then

${\displaystyle {dn_{i}(t) \over dt}=r_{i}n_{i}(t).}$
With the total population size
${\displaystyle n(t)=n_{1}(t)+n_{2}(t)\!}$
we have the proportion of type 1
${\displaystyle p(t)={n_{1}(t) \over n(t)}}$
and we define the selection coefficient as the difference in growth rates,
${\displaystyle s\equiv s_{12}=r_{1}-r_{2}.\!}$
Given our interest in understanding the change in gene frequencies, our goal is to compute the rate of change of ${\displaystyle p(t)}$.
${\displaystyle {\begin{aligned}
{\partial p(t) \over \partial t}&={\partial \over \partial t}\left({n_{1}(t) \over n(t)}\right)\\
&={\partial n_{1}(t) \over \partial t}\left({1 \over n(t)}\right)+n_{1}(t){-1 \over n(t)^{2}}{\partial n(t) \over \partial t}\\
&={\partial n_{1}(t) \over \partial t}\left({1 \over n(t)}\right)+n_{1}(t){-1 \over n(t)^{2}}\left({\partial n_{1}(t) \over \partial t}+{\partial n_{2}(t) \over \partial t}\right)\\
&={r_{1}n_{1}(t) \over n(t)}-{n_{1}(t) \over n(t)^{2}}\left(r_{1}n_{1}(t)+r_{2}n_{2}(t)\right)\\
&={r_{1}n_{1}(t) \over n(t)}-{n_{1}(t) \over n(t)^{2}}\left(r_{1}n_{1}(t)+(r_{1}-s)(n(t)-n_{1}(t))\right)\\
&={r_{1}n_{1}(t) \over n(t)}-{n_{1}(t) \over n(t)^{2}}\left(r_{1}n(t)-sn(t)+sn_{1}(t)\right)\\
&={n_{1}(t) \over n(t)^{2}}\left(sn(t)-sn_{1}(t)\right)\\
&=s{n_{1}(t) \over n(t)}\left(1-{n_{1}(t) \over n(t)}\right)\\
&=sp(t)(1-p(t))
\end{aligned}}}$

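This result is the logistic equation. Because ${\displaystyle p(1-p)}$ is largest at ${\displaystyle p=0.5}$ and vanishes at ${\displaystyle p=0}$ and ${\displaystyle p=1}$, the proportion of type 1 changes most rapidly when ${\displaystyle p=0.5}$ and most slowly when ${\displaystyle p}$ is very close to 0 or 1.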
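
To check the algebra numerically, the sketch below integrates ${\displaystyle dp/dt=sp(1-p)}$ and compares the result with the proportion computed directly from two exponentially growing populations. The growth rates and starting counts are made-up values, the function names are introduced here, and the fourth-order Runge-Kutta integrator is just one convenient choice.

```python
import math


def p_exact(t, n1_0, n2_0, r1, r2):
    """Proportion of type 1 computed directly from the two exponential curves."""
    n1 = n1_0 * math.exp(r1 * t)
    n2 = n2_0 * math.exp(r2 * t)
    return n1 / (n1 + n2)


def p_integrated(t_end, p0, s, steps=10000):
    """Integrate dp/dt = s * p * (1 - p) from 0 to t_end with classical RK4."""
    def f(p):
        return s * p * (1.0 - p)

    h = t_end / steps
    p = p0
    for _ in range(steps):
        k1 = f(p)
        k2 = f(p + 0.5 * h * k1)
        k3 = f(p + 0.5 * h * k2)
        k4 = f(p + h * k3)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return p


# Assumed illustrative values: type 1 starts rare but grows slightly faster.
r1, r2 = 0.75, 0.70
n1_0, n2_0 = 10.0, 990.0
s = r1 - r2
p0 = n1_0 / (n1_0 + n2_0)

for t in (0.0, 50.0, 100.0, 150.0):
    exact = p_exact(t, n1_0, n2_0, r1, r2)
    approx = p_integrated(t, p0, s)
    print(f"t={t:5.1f}  exact={exact:.6f}  integrated={approx:.6f}")
```

Only the difference ${\displaystyle s=r_{1}-r_{2}}$ enters the integrated version, which is the point of the derivation: the absolute growth rates drop out of the frequency dynamics.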

## Evolution is linear on a log-odds scale

The logit function ${\displaystyle \mathrm {logit} (p)=\ln {p \over 1-p}}$, which maps ${\displaystyle p\in (0,1)}$ onto ${\displaystyle \mathbb {R} }$, induces a more natural space for considering changes in frequencies. Rather than tracking the proportion of type 1 or 2, we instead track their log odds. In logit terms, with ${\displaystyle L_{p}(t)\equiv \mathrm {logit} (p(t))\!}$,

${\displaystyle {\begin{aligned}
{\partial L_{p}(t) \over \partial t}&={\partial \over \partial t}\left(\ln {p(t) \over 1-p(t)}\right)\\
&={\partial \over \partial t}\left(\ln {n_{1}(t) \over n_{2}(t)}\right)\\
&={\partial \over \partial t}\left(\ln \left[{n_{1}(0) \over n_{2}(0)}e^{st}\right]\right)\\
&=s.
\end{aligned}}}$

This differential equation ${\displaystyle L_{p}'(t)=s}$ has the solution

${\displaystyle L_{p}(t)=L_{p}(0)+st\!}$

showing that the log-odds of finding type 1 changes linearly in time, increasing if ${\displaystyle s>0}$ and decreasing if ${\displaystyle s<0}$.
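
The linear form makes it easy to move between frequencies and times. The sketch below, with assumed values for the starting frequency and for ${\displaystyle s}$ and with helper names introduced here, maps back from log-odds to a frequency and computes how long a type takes to go from 1% to 99% of the population.

```python
import math


def logit(p):
    """Log-odds of a proportion p in the open interval (0, 1)."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse of logit: map log-odds back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


def p_at(t, p0, s):
    """Proportion at time t, using L_p(t) = L_p(0) + s * t."""
    return inv_logit(logit(p0) + s * t)


# Assumed illustrative values.
p0 = 0.01
s = 0.05

# Time needed to go from p0 to 99%: solve L_p(0) + s * t = logit(0.99) for t.
t_99 = (logit(0.99) - logit(p0)) / s
print(f"time to reach 99%: {t_99:.1f}")
print(f"p at that time:    {p_at(t_99, p0, s):.4f}")

# Equal time steps produce equal steps in log-odds.
for t in (0.0, 50.0, 100.0, 150.0):
    p = p_at(t, p0, s)
    print(f"t={t:5.1f}  p={p:.4f}  logit(p)={logit(p):+.4f}")
```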

## Diffusion approximation

Insert math here.

## Statistical analysis of relative growth rates

We have three strains, ${\displaystyle i}$, ${\displaystyle j}$ and ${\displaystyle r}$, where ${\displaystyle r}$ is a reference strain. Strains ${\displaystyle i}$ and ${\displaystyle j}$ have fitness ${\displaystyle w_{i}=e^{r_{i}}}$ and ${\displaystyle w_{j}=e^{r_{j}}}$. Define the selection coefficient ${\displaystyle s_{ij}=\ln {\frac {w_{i}}{w_{j}}}=r_{i}-r_{j}}$ as usual. We have data consisting of triples (number of generations, number of cells of type ${\displaystyle i}$, number of cells of type ${\displaystyle r}$).

What is the best estimate of ${\displaystyle s_{ij}}$, and what is the error on that estimate?
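
Before setting up a formal model, one simple baseline follows from the log-odds result: the log ratio of a strain to the reference changes linearly with the number of generations, so the slope of a straight-line fit estimates that strain's selection coefficient against the reference, and ${\displaystyle s_{ij}}$ is the difference of two such slopes. The sketch below is an assumption on my part rather than the model developed in this section. It uses ordinary least squares, which presumes independent, equal-variance noise on the log ratios, and it combines the two standard errors as if the two competitions were independent. The data are synthetic and noise-free, and all names are introduced here.

```python
import math


def fit_selection_coefficient(triples):
    """Least-squares slope of ln(n_strain / n_ref) against generations.

    triples: (generations, cells of the strain, cells of the reference).
    Returns (slope, standard error of the slope); needs at least 3 points.
    """
    ts = [g for g, _, _ in triples]
    ys = [math.log(n_strain / n_ref) for _, n_strain, n_ref in triples]
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in ts)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    rss = sum((y - intercept - slope * t) ** 2 for t, y in zip(ts, ys))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope, std_err


# Synthetic, noise-free data (hypothetical): strain i beats the reference by
# 0.05 per generation, and strain j beats it by 0.02 per generation.
generations = [0, 5, 10, 15, 20]
data_i = [(g, 1000.0 * math.exp(0.05 * g), 1000.0) for g in generations]
data_j = [(g, 1000.0 * math.exp(0.02 * g), 1000.0) for g in generations]

s_i_ref, se_i = fit_selection_coefficient(data_i)
s_j_ref, se_j = fit_selection_coefficient(data_j)

# s_ij = (r_i - r_ref) - (r_j - r_ref) = r_i - r_j
s_ij = s_i_ref - s_j_ref
se_ij = math.sqrt(se_i ** 2 + se_j ** 2)  # assumes the two fits are independent
print(f"s_ij = {s_ij:.4f} +/- {se_ij:.2g}")
```

With real counts the residuals are unlikely to have equal variance, since sampling noise depends on the number of cells counted, so this fit is only a starting point.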

### Model

Let ${\displaystyle \Pr(s_{ij}=t)={\mathcal {N}}(t;\mu _{ij},\sigma _{ij}^{2})}$.

### Maximum-likelihood approach

### Bayesian approach