User:Hussein Alasadi/Notebook/stephens/2013/10/13: Difference between revisions

Latest revision as of 23:27, 26 September 2017

analyzing pooled sequenced data with selection


Intro to Wen & Stephens in 2D

Suppose we have only summary-level data for haplotypes [math]\displaystyle{ h_1, h_2, ..., h_{2n} }[/math]. Specifically, let the summary-level data be denoted by [math]\displaystyle{ y = (y_1, y_2)' = \frac{1}{2n} \sum_{i=1}^{2n} h_i }[/math]. In this two-locus model, the first locus is typed and the second locus is untyped. We want to predict the allele frequency at the untyped SNP using information from panel data (perhaps this can be interpreted as our prior). Formally, let [math]\displaystyle{ y_1 }[/math] denote the allele frequency at the typed SNP and [math]\displaystyle{ y_2 }[/math] the allele frequency at the untyped SNP. We assume that [math]\displaystyle{ h_1, h_2, ..., h_{2n} }[/math] are independent and identically distributed draws from [math]\displaystyle{ P(M) }[/math] (our prior).
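As a small illustration of the summary statistic, the sketch below averages 0/1-coded haplotypes locus by locus. The function name `pooled_frequencies` and the toy haplotypes are made up for this example and are not part of the original notes.

```python
def pooled_frequencies(haplotypes):
    """Return y = (1/2n) * sum_i h_i for a list of 0/1-coded haplotypes."""
    count = len(haplotypes)          # this is 2n in the notation above
    n_loci = len(haplotypes[0])
    return [sum(h[j] for h in haplotypes) / count for j in range(n_loci)]


# Toy data (hypothetical): 2n = 4 haplotypes at 2 loci.
haps = [(1, 1), (1, 0), (0, 0), (1, 1)]
y = pooled_frequencies(haps)
print(y)  # [0.75, 0.5]
```

In the pooled setting only [math]\displaystyle{ y_1 }[/math] would be observed; [math]\displaystyle{ y_2 }[/math] is the quantity to predict.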

We assume that [math]\displaystyle{ y \sim N(\mu, \Sigma) }[/math]. By properties of the bivariate normal distribution, [math]\displaystyle{ y_2 \mid y_1, M }[/math] ~ [math]\displaystyle{ N(\mu_2 + \rho \frac{\sigma_2}{\sigma_1}(y_1 - \mu_1), (1-\rho^2)\sigma_2^2) }[/math] where [math]\displaystyle{ \rho = \frac{\mathrm{Cov}(y_1, y_2)}{\sigma_1 \sigma_2} = \frac{E[y_1y_2] - \mu_1\mu_2}{\sigma_1 \sigma_2} }[/math]. The genius of Wen & Stephens lies in the idea that the distribution of [math]\displaystyle{ y_2 }[/math] (the untyped SNP) is a function of both the panel data ([math]\displaystyle{ \mu_2 }[/math]) and the typed SNP [math]\displaystyle{ (y_1) }[/math].
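The conditioning step can be checked numerically. The sketch below applies the standard bivariate normal result, conditional mean [math]\displaystyle{ \mu_2 + \rho \frac{\sigma_2}{\sigma_1}(y_1 - \mu_1) }[/math] and conditional variance [math]\displaystyle{ (1-\rho^2)\sigma_2^2 }[/math]. The function name and all numbers are hypothetical; in practice [math]\displaystyle{ \mu }[/math] and [math]\displaystyle{ \Sigma }[/math] would come from the panel.

```python
def conditional_normal(mu1, mu2, sigma1, sigma2, rho, y1):
    """Mean and variance of y2 given y1 under a bivariate normal model."""
    mean = mu2 + rho * (sigma2 / sigma1) * (y1 - mu1)
    var = (1.0 - rho ** 2) * sigma2 ** 2
    return mean, var


# Hypothetical values: panel-derived moments and an observed typed-SNP frequency.
cond_mean, cond_var = conditional_normal(
    mu1=0.3, mu2=0.4, sigma1=0.1, sigma2=0.2, rho=0.5, y1=0.4
)
print(cond_mean, cond_var)
```

With [math]\displaystyle{ \rho = 0 }[/math] the observation at the typed SNP carries no information and the prediction falls back to [math]\displaystyle{ N(\mu_2, \sigma_2^2) }[/math].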

Li & Stephens in 2D

We describe the Li & Stephens haplotype copying model. Let [math]\displaystyle{ h_1, h_2, ..., h_{k} }[/math] denote the k sampled haplotypes at 2 loci. With two biallelic loci there are 4 possible haplotypes. The first haplotype is chosen uniformly at random from these four.

Consider now the conditional distribution of [math]\displaystyle{ h_{k+1} }[/math] given [math]\displaystyle{ h_1, h_2,...,h_k }[/math]. Recall that the intuition is that [math]\displaystyle{ h_{k+1} }[/math] is a mosaic of [math]\displaystyle{ h_1, h_2,..,h_k }[/math].

Let [math]\displaystyle{ X_j }[/math] denote which haplotype [math]\displaystyle{ h_{k+1} }[/math] copies at site j (so [math]\displaystyle{ X_j \in \{1,2,..,k\} }[/math]).

We model [math]\displaystyle{ X_j }[/math] as a Markov chain on [math]\displaystyle{ \{1,..,k\} }[/math] with [math]\displaystyle{ P(X_1 = x) = \frac{1}{k} }[/math]. The transition probabilities are:

[math]\displaystyle{ P(X_{j+1}=x' \mid X_j = x) = e^{-\frac{\rho_jd_j}{k}} + (1-e^{-\frac{\rho_jd_j}{k}})(1/k) }[/math] if [math]\displaystyle{ x'=x }[/math] and

[math]\displaystyle{ (1-e^{-\frac{\rho_jd_j}{k}})(1/k) }[/math] otherwise. [math]\displaystyle{ \rho_j }[/math] and [math]\displaystyle{ d_j }[/math] denote the recombination rate and the physical distance between sites j and j+1, respectively.
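A minimal sketch of the copying chain's transition matrix, assuming the Li & Stephens form in which both exponentials carry the factor [math]\displaystyle{ 1/k }[/math]. The function name `transition_matrix` and the parameter values are illustrative only.

```python
import math


def transition_matrix(k, rho, d):
    """k x k matrix with entry [x][x'] = P(X_{j+1} = x' | X_j = x)."""
    stay = math.exp(-rho * d / k)   # no recombination event between the sites
    jump = (1.0 - stay) / k         # recombine, then pick a haplotype uniformly
    return [
        [stay + jump if x == x_next else jump for x_next in range(k)]
        for x in range(k)
    ]


# Hypothetical values for k, rho_j and d_j.
T = transition_matrix(k=3, rho=0.5, d=2.0)
for row in T:
    print(row, sum(row))
```

Each row sums to one because the chain either stays put (probability `stay`) or re-draws its state uniformly from all k haplotypes, including the current one.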

In a hidden Markov model there is also the emission process. To mimic the effects of mutation, the copying process may be imperfect: [math]\displaystyle{ P(h_{k+1,j} =a \mid X_j = x, h_1,..,h_k) = \frac{k}{k+\theta} + \frac{\theta}{2(k+\theta)} }[/math] if [math]\displaystyle{ h_{x,j} = a }[/math] and [math]\displaystyle{ \frac{\theta}{2(k+\theta)} }[/math] otherwise. Here [math]\displaystyle{ \theta = (\sum_{m=1}^{n-1} \frac{1}{m})^{-1} }[/math]; the motivation is that the more haplotypes there are, the less frequently mutation occurs.
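Putting the pieces together for two loci, the sketch below sums over the hidden copying states to get [math]\displaystyle{ P(h_{k+1} \mid h_1,..,h_k) }[/math]. It is only an illustration under stated assumptions: the helper names are invented, the panel is toy data, and n is passed explicitly because the notes do not tie it to k (the value n = k + 1 used here is an arbitrary choice).

```python
import math
from itertools import product


def theta_hat(n):
    """theta = (sum_{m=1}^{n-1} 1/m)^(-1); requires n >= 2."""
    return 1.0 / sum(1.0 / m for m in range(1, n))


def emission(a, copied_allele, k, theta):
    """P(observed allele a | allele carried by the copied haplotype)."""
    mutate = theta / (2.0 * (k + theta))
    return k / (k + theta) + mutate if a == copied_allele else mutate


def conditional_prob(new_hap, panel, rho, d, theta):
    """P(new_hap | panel) for 2 loci, summing over copying paths (X_1, X_2)."""
    k = len(panel)
    stay = math.exp(-rho * d / k)
    jump = (1.0 - stay) / k
    # Site 1: uniform initial state times emission.
    f1 = [emission(new_hap[0], h[0], k, theta) / k for h in panel]
    total = 0.0
    for x2 in range(k):
        reach = sum(f1[x1] * (stay + jump if x1 == x2 else jump) for x1 in range(k))
        total += reach * emission(new_hap[1], panel[x2][1], k, theta)
    return total


panel = [(0, 0), (0, 1), (1, 1)]       # toy panel, k = 3
theta = theta_hat(len(panel) + 1)      # assumption: n = k + 1
probs = {h: conditional_prob(h, panel, rho=0.5, d=2.0, theta=theta)
         for h in product((0, 1), repeat=2)}
print(probs)
```

The four probabilities sum to one, which is a quick sanity check that the initial, transition, and emission terms are each properly normalized.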

