Difference between revisions of "User:Hussein Alasadi/Notebook/stephens/2013/10/13"


Latest revision as of 22:27, 26 September 2017

Analyzing pooled sequenced data with selection

Intro to Wen & Stephens in 2D

Suppose we have only summary-level data for haplotypes <math>h_1, h_2, \ldots, h_{2n}</math>. Specifically, let the summary-level data be denoted by <math>y = (y_1, y_2)' = \frac{1}{2n} \sum_{i=1}^{2n} h_i</math>. We assume that in this two-locus model the first locus is typed and the second locus is untyped. We hope to predict the allele frequency at the untyped SNP using information from panel data (perhaps this can be interpreted as our prior). Formally, let <math>y_1</math> denote the allele frequency at the typed SNP and <math>y_2</math> the allele frequency at the untyped SNP.
We assume that <math>h_1, h_2, \ldots, h_{2n}</math> are independent and identically distributed from <math>P(M)</math> (our prior).
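As a small illustration of the summary-level data, the sketch below averages a set of two-locus haplotypes to obtain <math>y = (y_1, y_2)'</math>. The toy haplotypes and the function name `summary_frequencies` are hypothetical and are not taken from the notebook.

```python
# Hypothetical toy data: 2n = 6 haplotypes at two loci (1 = alternate allele).
haplotypes = [(0, 0), (1, 1), (1, 0), (0, 0), (1, 1), (1, 1)]


def summary_frequencies(haps):
    """Return y = (y_1, y_2): the per-locus average of the haplotype vectors."""
    total = len(haps)
    y1 = sum(h[0] for h in haps) / total  # allele frequency at the typed locus
    y2 = sum(h[1] for h in haps) / total  # allele frequency at the untyped locus
    return y1, y2


y = summary_frequencies(haplotypes)
print(y)
```

In the setting described above only <math>y_1</math> would actually be observed; <math>y_2</math> is computed here just to show what the quantity being predicted looks like.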


We assume that <math>y \sim N(\mu, \Sigma)</math>. By properties of bivariate normal distributions, <math>y_2 \mid y_1, M \sim N\left(\mu_2 + \rho \frac{\sigma_2}{\sigma_1}(y_1 - \mu_1),\ (1-\rho^2)\sigma_2^2\right)</math>, where <math>\rho = \frac{E[(y_1-\mu_1)(y_2-\mu_2)]}{\sigma_1 \sigma_2}</math> is the correlation between <math>y_1</math> and <math>y_2</math>.
The genius of Wen & Stephens lies in the idea that the distribution of <math>y_2</math> (the allele frequency at the untyped SNP) is a function of both the panel data (through <math>\mu_2</math>) and the typed SNP (<math>y_1</math>).
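To make the conditioning step concrete, here is a minimal sketch of the standard bivariate normal conditional formulas: the conditional mean shifts <math>\mu_2</math> in proportion to how far <math>y_1</math> lies from <math>\mu_1</math>, and the conditional variance is the variance of <math>y_2</math> shrunk by <math>1-\rho^2</math>. The function name and all numeric values are hypothetical; in practice the parameters would be estimated from the panel.

```python
def conditional_normal(y1, mu1, mu2, sigma1, sigma2, rho):
    """Mean and variance of y_2 given y_1 under a bivariate normal model."""
    mean = mu2 + rho * (sigma2 / sigma1) * (y1 - mu1)
    variance = (1.0 - rho ** 2) * sigma2 ** 2
    return mean, variance


# Hypothetical panel-derived parameters and an observed typed-SNP frequency.
cond_mean, cond_var = conditional_normal(
    y1=0.40, mu1=0.30, mu2=0.50, sigma1=0.05, sigma2=0.04, rho=0.8
)
print(cond_mean, cond_var)
```

With <math>\rho = 0</math> the prediction falls back to the panel frequency <math>\mu_2</math>, and as <math>|\rho|</math> approaches 1 the conditional variance goes to zero.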

Li & Stephens in 2D

We describe the Li & Stephens haplotype copying model. Let <math>h_1, h_2, \ldots, h_k</math> denote the <math>k</math> sampled haplotypes at 2 loci. With two alleles at each locus, there are 4 possible haplotypes. The first haplotype is chosen at random, with equal probability, from the four possible haplotypes.

Consider now the conditional distribution of <math>h_{k+1}</math> given <math>h_1, h_2, \ldots, h_k</math>. Recall the intuition is that <math>h_{k+1}</math> is a mosaic of <math>h_1, h_2, \ldots, h_k</math>.

Let <math>X_j</math> denote which haplotype <math>h_{k+1}</math> copies at site <math>j</math> (so <math>X_j \in \{1,2,\ldots,k\}</math>).

We model <math>X_j</math> as a Markov chain on <math>\{1,\ldots,k\}</math> with <math>P(X_1 = x) = \frac{1}{k}</math>. The transition probabilities are:

<math>P(X_{j+1}=x' \mid X_j = x) = e^{-\frac{\rho_j d_j}{k}} + \left(1-e^{-\frac{\rho_j d_j}{k}}\right)\frac{1}{k}</math> if <math>x'=x</math>, and

<math>\left(1-e^{-\frac{\rho_j d_j}{k}}\right)\frac{1}{k}</math> otherwise. Here <math>\rho_j</math> and <math>d_j</math> denote the recombination rate and the physical distance between sites <math>j</math> and <math>j+1</math>, respectively. With these definitions the <math>k</math> transition probabilities out of any state sum to 1.
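The transition rule can be sketched as a <math>k \times k</math> matrix, using the same factor <math>e^{-\rho_j d_j / k}</math> in both cases so that every row sums to 1. The function name and the parameter values below are hypothetical and chosen only for illustration.

```python
import math


def transition_matrix(k, rho_j, d_j):
    """Copying-state transition matrix between adjacent sites j and j+1."""
    stay = math.exp(-rho_j * d_j / k)  # probability of no recombination event
    jump = (1.0 - stay) / k            # after a recombination, pick any of the k states
    return [
        [stay + jump if x_next == x else jump for x_next in range(k)]
        for x in range(k)
    ]


P = transition_matrix(k=4, rho_j=0.5, d_j=2.0)
row_sums = [sum(row) for row in P]
print(P[0], row_sums)
```

Note that the "jump" term also appears on the diagonal, because after a recombination event the chain may land back on the haplotype it was already copying.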

Now in a hidden Markov model, there is also the emission process. To mimic the effects of mutation, the copying process may be imperfect: <math>P(h_{k+1,j} = a \mid X_j = x, h_1,\ldots,h_k) = \frac{k}{k+\theta} + \frac{\theta}{2(k+\theta)}</math> if <math>h_{x,j} = a</math>, and <math>\frac{\theta}{2(k+\theta)}</math> otherwise. Here <math>\theta = \left(\sum_{m=1}^{n-1} \frac{1}{m}\right)^{-1}</math>, where the motivation is that the more haplotypes there are, the less frequently mutation occurs.
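Putting the pieces together for two loci, the following is a minimal sketch of the forward computation of <math>P(h_{k+1} \mid h_1,\ldots,h_k)</math>: a uniform initial copying state, the copy-with-mutation probabilities at each site, and one transition between the two sites. The panel haplotypes, the values of <math>\rho</math> and <math>d</math>, the choice <math>n = 5</math> in <math>\theta</math>, and all function names are assumptions made for illustration only.

```python
import math


def theta_estimate(n):
    """theta = (sum_{m=1}^{n-1} 1/m)^(-1); requires n >= 2."""
    return 1.0 / sum(1.0 / m for m in range(1, n))


def emission_prob(allele, copied_allele, k, theta):
    """Probability of observing `allele` when copying `copied_allele`."""
    mismatch = theta / (2.0 * (k + theta))
    if allele == copied_allele:
        return k / (k + theta) + mismatch
    return mismatch


def two_locus_conditional(new_hap, haps, rho, d, theta):
    """P(new_hap | haps) for two sites, summing over all copying paths."""
    k = len(haps)
    stay = math.exp(-rho * d / k)
    jump = (1.0 - stay) / k
    # Site 1: uniform prior over copied haplotypes times the emission term.
    alpha = [emission_prob(new_hap[0], h[0], k, theta) / k for h in haps]
    prob = 0.0
    for x2 in range(k):
        # Site 2: total probability of arriving in copying state x2.
        incoming = sum(
            a * (stay + jump if x1 == x2 else jump) for x1, a in enumerate(alpha)
        )
        prob += incoming * emission_prob(new_hap[1], haps[x2][1], k, theta)
    return prob


panel = [(0, 0), (0, 0), (1, 1), (1, 0)]  # hypothetical k = 4 haplotypes
theta = theta_estimate(5)                 # n = 5 is an assumed value
probs = {
    h: two_locus_conditional(h, panel, rho=0.1, d=1.0, theta=theta)
    for h in [(0, 0), (0, 1), (1, 0), (1, 1)]
}
print(probs)
```

Because the two emission probabilities at a site sum to 1 and each row of the transition matrix sums to 1, the four conditional probabilities add up to 1. In this toy panel the haplotype (0, 1), which is absent from the panel, receives much less probability than (0, 0).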