Difference between revisions of "User:Hussein Alasadi/Notebook/stephens/2013/10/03"
(→Notes from Meeting) 

(22 intermediate revisions by the same user not shown)  
Line 36:
We assume a prior on our vector of frequencies based on our panel of SNPs <math> (M) </math> of dimension <math> 2m \times p </math>
−  <math> \vec{f_{i,k}} </math> ~ <math> MVN(\mu, \
+  <math> \vec{f_{i,k}} </math> ~ <math> MVN(\mu, \Sigma) </math>
<math> \mu = (1-\theta)f^{panel} + \frac{\theta}{2} 1 </math>
−  <math> \
+  <math> \Sigma = (1-\theta)^2 S + \frac{\theta}{2}(1 - \frac{\theta}{2})I </math>
where <math> S_{i,j} = \Sigma_{i,j}^{panel} </math> if i = j or <math> e^{-\frac{\rho_{i,j}}{2m}} \Sigma_{i,j}^{panel} </math> if i is not equal to j
Line 51:
* '''conditional distribution'''
−  <math> (f_{i,k,2}, \ldots , f_{i,k,p}) | f_{i,k,1}, M </math> ~ <math> MVN(\bar{\mu}, \bar{\
+  <math> (f_{i,k,2}, \ldots , f_{i,k,p}) | f_{i,k,1}, M </math> ~ <math> MVN(\bar{\mu}, \bar{\Sigma}) </math>
−  The conditional distribution is easily obtained when we use a result derived [http://openwetware.org/wiki/User:Hussein_Alasadi/Notebook/stephens/2013/10/14 here].
+  The conditional distribution is easily obtained when we use a result derived [http://openwetware.org/wiki/User:Hussein_Alasadi/Notebook/stephens/2013/10/14 here].
+  
+  Let <math> X_2 = (f_{i,k,2}, \ldots , f_{i,k,p}) </math> and <math> X_1 = f_{i,k,1} </math>
+
+  <math> X_2 | X_1, M </math> ~ <math> N(\vec{\mu_2} + \Sigma_{21} \Sigma_{11}^{-1} (x_1 - \mu_1), \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}) </math>
+
+  Thus <math> \bar{\mu} = \vec{\mu_2} + \Sigma_{21} \Sigma_{11}^{-1} (x_1 - \mu_1), \bar{\Sigma} = \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12} </math>
+
+  Equivalently, we could derive the distribution <math> X_1 | X_2, M </math>, which is <math> f_{i,k,1} | (f_{i,k,2}, \ldots , f_{i,k,p}), M </math>
+  
+  *'''Likelihood for the frequency at the test SNP t given all data'''
+
+  Let <math> f_{obs} = (f_{i,k,j})_{j \neq t} </math>, the vector of frequencies at every SNP other than t.
+
+  <math> L(f_{i,k,t}^{true}) = P(f_{obs} | f_{i,k,t}^{true}, M) = \frac{P( f_{i,k,t}^{true} | M, f_{obs}) P(f_{obs} | M)}{P(f_{i,k,t}^{true} | M)} </math>
+
+  Confused here: can we just use the expression derived above for <math> P( f_{i,k,t}^{true} | M, f_{obs}) </math>? Also, isn't <math> f_{i,k,t}^{true} | M </math> ~ <math> N(\mu_1, \Sigma_{11}) </math> and <math> f_{obs} | M </math> ~ <math> N(\mu_2, \Sigma_{22}) </math>? But how do we then incorporate <math> \beta </math> into the likelihood calculation?
+  
+  
+  But maybe we want to incorporate dispersion and measurement error parameters.
+
+  Then:
+  <math> f_{i,k,t}^{true} | M </math> ~ <math> N(\mu, \sigma^2 \Sigma) </math>, where the parameter <math> \sigma^2 </math> allows for overdispersion, and
+  <math> f_{obs} | M </math> ~ <math> N_{p-1} (\mu_2, \sigma^2 \Sigma_{22} + \epsilon^2 I) </math>, where <math> \epsilon^2 </math> allows for measurement error.
+
+  I don't understand <math> f_{obs} | f_{i,k,t}^{true}, M </math>. Shouldn't it come from (2.12) and not (2.13)? Ask Matthew.
+  
+  
+  
Revision as of 19:04, 16 October 2013
analyzing pooled sequenced data with selection
Notes from Meeting

Consider a single lineage for now.

<math> X_j </math> = frequency of "1" allele at SNP j in the pool (i.e. the true frequency of the 1 allele in the pool)
<math> (n_j^0, n_j^1) </math> = number of "0", "1" alleles at SNP j (<math> n_j = n_j^0 + n_j^1 </math>)
<math> n_j^1 </math> ~ <math> Bin(n_j, X_j) \approx N(n_jX_j, n_jX_j(1-X_j)) </math> (normal approximation to the binomial)

<math> \frac{n_j^1}{n_j} \approx N(X_j, \frac{X_j(1-X_j)}{n_j}) </math>

The variance of this distribution results from error due to binomial sampling. To simplify, we just plug in <math> \hat{X_j} = \frac{n_j^1}{n_j} </math> for <math> X_j </math>:

<math> \implies \frac{n_j^1}{n_j} | X_j \approx N(X_j, \frac{\hat{X_j}(1-\hat{X_j})}{n_j}) </math>
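The plug-in approximation above can be sketched in a few lines of Python. This is only an illustration: the function name `allele_freq_normal_approx` and the read counts are hypothetical, not part of the notes.

```python
def allele_freq_normal_approx(n0, n1):
    """Return (mean, variance) of the plug-in normal approximation
    for the observed "1" allele frequency n1 / (n0 + n1) at one SNP."""
    n = n0 + n1
    if n <= 0:
        raise ValueError("need at least one read at this SNP")
    x_hat = n1 / n                    # plug-in estimate of X_j
    var = x_hat * (1.0 - x_hat) / n   # binomial sampling variance
    return x_hat, var


# Hypothetical counts: 60 reads carry the "0" allele, 40 carry the "1" allele.
mean, var = allele_freq_normal_approx(n0=60, n1=40)
print(mean, var)
```

The variance shrinks as the read depth grows, which is the only source of error this approximation accounts for.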
<math> f_{i,k,j} = </math> frequency of the reference allele in group i, replicate k, and SNP j.

<math> \vec{f_{i,k}} = </math> vector of frequencies.

Without loss of generality, we assume that the putative selected site is site <math> j = 1 </math>.
We assume a prior on our vector of frequencies based on our panel of SNPs <math> (M) </math> of dimension <math> 2m \times p </math>:

<math> \vec{f_{i,k}} </math> ~ <math> MVN(\mu, \Sigma) </math>

<math> \mu = (1-\theta)f^{panel} + \frac{\theta}{2} 1 </math>

<math> \Sigma = (1-\theta)^2 S + \frac{\theta}{2}(1 - \frac{\theta}{2})I </math>

where <math> S_{i,j} = \Sigma_{i,j}^{panel} </math> if i = j or <math> e^{-\frac{\rho_{i,j}}{2m}} \Sigma_{i,j}^{panel} </math> if i is not equal to j, and

<math> \theta = \frac{(\sum_{i=1}^{2m-1} \frac{1}{i})^{-1}}{2m + (\sum_{i=1}^{2m-1} \frac{1}{i})^{-1}} </math>
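A minimal sketch of how the prior moments might be computed from a panel, assuming NumPy. Several things here are assumptions rather than anything the notes state: M is taken to be a 2m x p matrix of 0/1 haplotypes, the panel frequencies are column means, the panel covariance uses the 1/(2m) denominator, the off-diagonal entries are damped by exp(-rho/(2m)), and `rho` is a p x p matrix of pairwise recombination rates supplied by the caller. The function name `prior_moments` is hypothetical.

```python
import numpy as np


def prior_moments(M, rho):
    """Return (theta, mu, Sigma) for the MVN prior on the frequency vector.

    M   : (2m, p) array of 0/1 panel haplotypes (assumed layout), p >= 2
    rho : (p, p) array of pairwise recombination rates (assumed input)
    """
    M = np.asarray(M, dtype=float)
    two_m, p = M.shape

    f_panel = M.mean(axis=0)                           # panel allele frequencies
    sigma_panel = np.cov(M, rowvar=False, bias=True)   # panel covariance, 1/(2m) denominator

    harmonic = sum(1.0 / i for i in range(1, two_m))   # sum_{i=1}^{2m-1} 1/i
    theta = (1.0 / harmonic) / (two_m + 1.0 / harmonic)

    # Off-diagonal entries are shrunk by exp(-rho/(2m)); the diagonal is left unchanged.
    S = np.exp(-np.asarray(rho, dtype=float) / two_m) * sigma_panel
    np.fill_diagonal(S, np.diag(sigma_panel))

    mu = (1.0 - theta) * f_panel + theta / 2.0
    Sigma = (1.0 - theta) ** 2 * S + (theta / 2.0) * (1.0 - theta / 2.0) * np.eye(p)
    return theta, mu, Sigma


# Toy panel with 2m = 4 haplotypes and p = 2 SNPs, and no recombination.
M = np.array([[0, 1], [1, 1], [0, 0], [1, 0]])
theta, mu, Sigma = prior_moments(M, rho=np.zeros((2, 2)))
print(theta, mu, Sigma)
```

The identity term added to Sigma keeps the prior covariance positive definite even when the panel covariance is singular.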
<math> \log \frac{f_{i,k,1}}{1-f_{i,k,1}} = \mu + \beta g_i + \epsilon_{i,k} </math>
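As a small illustration of the logit model for the selected site, the following sketch maps the linear predictor back to a frequency. It assumes g_i is a numeric covariate for group i, which the notes do not define, and the names `logit`, `inv_logit` and `selected_site_freq` are hypothetical.

```python
import math


def logit(f):
    """Log-odds of a frequency strictly between 0 and 1."""
    return math.log(f / (1.0 - f))


def inv_logit(x):
    """Inverse of logit: maps any real number back into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def selected_site_freq(mu, beta, g, eps=0.0):
    """Frequency at the selected site implied by logit(f) = mu + beta * g + eps."""
    return inv_logit(mu + beta * g + eps)


# With beta = 0 the group covariate has no effect and f depends only on mu.
print(selected_site_freq(mu=0.0, beta=0.0, g=1.0))
print(selected_site_freq(mu=0.0, beta=0.5, g=1.0))
```

Working on the logit scale keeps the implied frequency inside (0, 1) for any value of beta.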
* '''conditional distribution'''

<math> (f_{i,k,2}, \ldots , f_{i,k,p}) | f_{i,k,1}, M </math> ~ <math> MVN(\bar{\mu}, \bar{\Sigma}) </math>

The conditional distribution is easily obtained when we use a result derived [http://openwetware.org/wiki/User:Hussein_Alasadi/Notebook/stephens/2013/10/14 here].

Let <math> X_2 = (f_{i,k,2}, \ldots , f_{i,k,p}) </math> and <math> X_1 = f_{i,k,1} </math>.

<math> X_2 | X_1, M </math> ~ <math> N(\vec{\mu_2} + \Sigma_{21} \Sigma_{11}^{-1} (x_1 - \mu_1), \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}) </math>

Thus <math> \bar{\mu} = \vec{\mu_2} + \Sigma_{21} \Sigma_{11}^{-1} (x_1 - \mu_1), \bar{\Sigma} = \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12} </math>

Equivalently, we could derive the distribution <math> X_1 | X_2, M </math>, which is <math> f_{i,k,1} | (f_{i,k,2}, \ldots , f_{i,k,p}), M </math>.
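The conditioning step can be sketched with NumPy as follows. The function name `condition_on_first` and the toy numbers are hypothetical; the formulas are the standard multivariate normal conditionals quoted above, specialised to a scalar first block.

```python
import numpy as np


def condition_on_first(mu, Sigma, x1):
    """Mean and covariance of X_2 | X_1 = x1 when (X_1, X_2) ~ MVN(mu, Sigma)
    and X_1 is the first coordinate (the putative selected site)."""
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)

    mu1, mu2 = mu[0], mu[1:]
    s11 = Sigma[0, 0]        # Sigma_11 is a scalar here
    s21 = Sigma[1:, 0]       # Sigma_21, a vector of length p - 1
    S22 = Sigma[1:, 1:]      # Sigma_22

    mu_bar = mu2 + s21 * (x1 - mu1) / s11
    Sigma_bar = S22 - np.outer(s21, s21) / s11
    return mu_bar, Sigma_bar


# Two correlated sites: observing x1 above its mean pulls the second site up.
mu_bar, Sigma_bar = condition_on_first([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], x1=2.0)
print(mu_bar, Sigma_bar)
```

Because Sigma_11 is a scalar, no matrix inverse is needed; conditioning in the other direction would require solving against Sigma_22 instead.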
* '''Likelihood for the frequency at the test SNP t given all data'''

Let <math> f_{obs} = (f_{i,k,j})_{j \neq t} </math>, the vector of frequencies at every SNP other than t.

<math> L(f_{i,k,t}^{true}) = P(f_{obs} | f_{i,k,t}^{true}, M) = \frac{P( f_{i,k,t}^{true} | M, f_{obs}) P(f_{obs} | M)}{P(f_{i,k,t}^{true} | M)} </math>

Confused here: can we just use the expression derived above for <math> P( f_{i,k,t}^{true} | M, f_{obs}) </math>? Also, isn't <math> f_{i,k,t}^{true} | M </math> ~ <math> N(\mu_1, \Sigma_{11}) </math> and <math> f_{obs} | M </math> ~ <math> N(\mu_2, \Sigma_{22}) </math>? But how do we then incorporate <math> \beta </math> into the likelihood calculation?
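One way to probe the first question numerically is to check that the Bayes-rule expression and the direct conditional density agree under the plain Gaussian prior. The sketch below assumes the test SNP is stored at index 0 and that there is no overdispersion or measurement error; all function names and numbers are hypothetical, and it says nothing about how <math> \beta </math> should enter.

```python
import numpy as np


def mvn_logpdf(x, mean, cov):
    """Log-density of a (possibly one-dimensional) multivariate normal."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    mean = np.atleast_1d(np.asarray(mean, dtype=float))
    cov = np.atleast_2d(np.asarray(cov, dtype=float))
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    quad = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (x.size * np.log(2.0 * np.pi) + logdet + quad)


def loglik_via_bayes(f_t, f_obs, mu, Sigma):
    """log P(f_obs | f_t, M) computed as posterior + marginal - prior."""
    mu, Sigma, f_obs = (np.asarray(a, dtype=float) for a in (mu, Sigma, f_obs))
    mu1, mu2 = mu[0], mu[1:]
    s11, s12, S22 = Sigma[0, 0], Sigma[0, 1:], Sigma[1:, 1:]
    w = np.linalg.solve(S22, s12)              # Sigma_22^{-1} Sigma_21
    post_mean = mu1 + w @ (f_obs - mu2)        # mean of f_t | f_obs, M
    post_var = s11 - w @ s12                   # variance of f_t | f_obs, M
    return (mvn_logpdf(f_t, post_mean, post_var)
            + mvn_logpdf(f_obs, mu2, S22)
            - mvn_logpdf(f_t, mu1, s11))


def loglik_direct(f_t, f_obs, mu, Sigma):
    """log P(f_obs | f_t, M) from the conditional MVN of the other sites."""
    mu, Sigma = np.asarray(mu, dtype=float), np.asarray(Sigma, dtype=float)
    s21 = Sigma[1:, 0]
    mu_bar = mu[1:] + s21 * (f_t - mu[0]) / Sigma[0, 0]
    Sigma_bar = Sigma[1:, 1:] - np.outer(s21, s21) / Sigma[0, 0]
    return mvn_logpdf(f_obs, mu_bar, Sigma_bar)


mu = [0.3, 0.4, 0.5]
Sigma = [[0.04, 0.01, 0.00], [0.01, 0.05, 0.02], [0.00, 0.02, 0.06]]
print(loglik_via_bayes(0.35, [0.45, 0.42], mu, Sigma))
print(loglik_direct(0.35, [0.45, 0.42], mu, Sigma))
```

If the two values match, then under these assumptions the conditional derived earlier is enough to evaluate the likelihood, and the Bayes-rule form is only a rearrangement of it.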
But maybe we want to incorporate dispersion and measurement error parameters. Then:

<math> f_{i,k,t}^{true} | M </math> ~ <math> N(\mu, \sigma^2 \Sigma) </math>, where the parameter <math> \sigma^2 </math> allows for overdispersion, and

<math> f_{obs} | M </math> ~ <math> N_{p-1} (\mu_2, \sigma^2 \Sigma_{22} + \epsilon^2 I) </math>, where <math> \epsilon^2 </math> allows for measurement error.

I don't understand <math> f_{obs} | f_{i,k,t}^{true}, M </math>. Shouldn't it come from (2.12) and not (2.13)? Ask Matthew.
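For the marginal of the observed frequencies with overdispersion and measurement error, a minimal sketch of the log-density is given below. It only evaluates the distribution of the observed frequencies given M as written in these notes and does not resolve the open question about the conditional on the true frequency. The function name `obs_loglik` and the parameter values are hypothetical.

```python
import numpy as np


def obs_loglik(f_obs, mu2, Sigma22, sigma2, eps2):
    """Log-density of f_obs under N(mu2, sigma2 * Sigma22 + eps2 * I).

    sigma2 : overdispersion parameter (scales the prior covariance)
    eps2   : measurement-error variance (added to the diagonal)
    """
    f_obs = np.asarray(f_obs, dtype=float)
    mu2 = np.asarray(mu2, dtype=float)
    Sigma22 = np.asarray(Sigma22, dtype=float)

    cov = sigma2 * Sigma22 + eps2 * np.eye(f_obs.size)
    L = np.linalg.cholesky(cov)                  # cov = L @ L.T
    z = np.linalg.solve(L, f_obs - mu2)          # whitened residual
    logdet = 2.0 * np.sum(np.log(np.diag(L)))    # log det(cov)
    return -0.5 * (f_obs.size * np.log(2.0 * np.pi) + logdet + z @ z)


# Hypothetical values: two non-test SNPs observed exactly at their prior mean.
print(obs_loglik([0.2, 0.7], [0.2, 0.7], np.eye(2), sigma2=2.0, eps2=1.0))
```

A positive measurement-error variance also guarantees that the Cholesky factorisation succeeds, since it makes the covariance strictly positive definite.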
