User:Timothee Flutre/Notebook/Postdoc/2011/11/10
Project name  <html><img src="/images/9/94/Report.png" border="0" /></html> Main project page <html><img src="/images/c/c3/Resultset_previous.png" border="0" /></html>Previous entry<html> </html>Next entry<html><img src="/images/5/5c/Resultset_next.png" border="0" /></html> 
Bayesian model of univariate linear regression for QTL detectionThis page aims at helping people like me, interested in quantitative genetics, to get a better understanding of some Bayesian models, most importantly the impact of the modeling assumptions as well as the underlying maths. It starts with a simple model, and gradually increases the scope to relax assumptions. See references to scientific articles at the end.
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \forall i \in \{1,\ldots,N\}, \; y_i = \mu + \beta_1 g_i + \beta_2 \mathbf{1}_{g_i=1} + \epsilon_i \; \text{ with } \; \epsilon_i \; \overset{i.i.d}{\sim} \; \mathcal{N}(0,\tau^{1})} where Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \beta_1} is in fact the additive effect of the SNP, noted Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle a} from now on, and Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \beta_2} is the dominance effect of the SNP, Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle d = a k} . Let's now write the model in vectormatrix notation: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle Y = X B + E \text{ where } B = [ \mu \; a \; d ]^T} This gives the following multivariate Normal distribution for the phenotypes: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle Y  X, \tau, B \sim \mathcal{N}(XB, \tau^{1} I_N)} Even though we can write the likelihood as a multivariate Normal, I still keep the term "univariate" in the title because the regression has a single response, Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle Y} . It is usual to keep the term "multivariate" for the case where there is a matrix of responses (i.e. multiple phenotypes). The likelihood of the parameters given the data is therefore: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathcal{L}(\tau, B) = \mathsf{P}(Y  X, \tau, B)} Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathcal{L}(\tau, B) = \left(\frac{\tau}{2 \pi}\right)^{\frac{N}{2}} exp \left( \frac{\tau}{2} (Y  XB)^T (Y  XB) \right)}
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(\tau, B) = \mathsf{P}(\tau) \mathsf{P}(B  \tau)} A Gamma distribution for Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \tau} : Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \tau \sim \Gamma(\kappa/2, \, \lambda/2)} which means: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(\tau) = \frac{\frac{\lambda}{2}^{\kappa/2}}{\Gamma(\frac{\kappa}{2})} \tau^{\frac{\kappa}{2}1} e^{\frac{\lambda}{2} \tau}} And a multivariate Normal distribution for Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle B} : Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle B  \tau \sim \mathcal{N}(\vec{0}, \, \tau^{1} \Sigma_B) \text{ with } \Sigma_B = diag(\sigma_{\mu}^2, \sigma_a^2, \sigma_d^2)} which means: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(B  \tau) = \left(\frac{\tau}{2 \pi}\right)^{\frac{3}{2}} \Sigma_B^{\frac{1}{2}} exp \left(\frac{\tau}{2} B^T \Sigma_B^{1} B \right)}
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(\tau, B  Y, X) = \mathsf{P}(\tau  Y, X) \mathsf{P}(B  Y, X, \tau)}
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(B  Y, X, \tau) = \frac{\mathsf{P}(B, Y  X, \tau)}{\mathsf{P}(Y  X, \tau)}} Let's neglect the normalization constant for now: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(B  Y, X, \tau) \propto \mathsf{P}(B  \tau) \mathsf{P}(Y  X, \tau, B)} Similarly, let's keep only the terms in Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle B} for the moment: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(B  Y, X, \tau) \propto exp(B^T \Sigma_B^{1} B) exp((YXB)^T(YXB))} We expand: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(B  Y, X, \tau) \propto exp(B^T \Sigma_B^{1} B  Y^TXB B^TX^TY + B^TX^TXB)} We factorize some terms: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(B  Y, X, \tau) \propto exp(B^T (\Sigma_B^{1} + X^TX) B  Y^TXB B^TX^TY)} Importantly, let's define: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \Omega = (\Sigma_B^{1} + X^TX)^{1}} We can see that Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \Omega^T=\Omega} , which means that Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \Omega} is a symmetric matrix. This is particularly useful here because we can use the following equality: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \Omega^{1}\Omega^T=I} . Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(B  Y, X, \tau) \propto exp(B^T \Omega^{1} B  (X^TY)^T\Omega^{1}\Omega^TB B^T\Omega^{1}\Omega^TX^TY)} This now becomes easy to factorizes totally (remember to add the term in Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle Y} to complete the square): Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(B  Y, X, \tau) \propto exp((B  \Omega X^TY)^T\Omega^{1}(B  \Omega X^TY))} We recognize the kernel of a Normal distribution, allowing us to write the conditional posterior as: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle B  Y, X, \tau \sim \mathcal{N}(\Omega X^TY, \tau^{1} \Omega)}
Similarly to the equations above: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(\tau  Y, X) \propto \mathsf{P}(\tau) \mathsf{P}(Y  X, \tau)} But now, to handle the second term, we need to integrate over Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle B} , thus effectively taking into account the uncertainty in Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle B} : Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(\tau  Y, X) \propto \mathsf{P}(\tau) \int \mathsf{P}(B  \tau) \mathsf{P}(Y  X, \tau, B) \mathsf{d}B} Again, we use the priors and likelihoods specified above (but everything inside the integral is kept inside it, even if it doesn't depend on Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle B} !): Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(\tau  Y, X) \propto \tau^{\frac{\kappa}{2}  1} e^{\frac{\lambda}{2} \tau} \int \tau^{3/2} \tau^{N/2} exp(\frac{\tau}{2} B^T \Sigma_B^{1} B) exp(\frac{\tau}{2} (Y  XB)^T (Y  XB)) \mathsf{d}B} As we used a conjugate prior for Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \tau} , we know that we expect a Gamma distribution for the posterior. Therefore, we can take Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \tau^{N/2}} out of the integral and start guessing what looks like a Gamma distribution. We also factorize inside the exponential: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(\tau  Y, X) \propto \tau^{\frac{N+\kappa}{2}  1} e^{\frac{\lambda}{2} \tau} \int \tau^{3/2} exp \left[\frac{\tau}{2} \left( (B  \Omega X^T Y)^T \Omega^{1} (B  \Omega X^T Y)  Y^T X \Omega X^T Y + Y^T Y \right) \right] \mathsf{d}B} We recognize the conditional posterior of Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle B} . This allows us to use the fact that the pdf of the Normal distribution integrates to one: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(\tau  Y, X) \propto \tau^{\frac{N+\kappa}{2}  1} e^{\frac{\lambda}{2} \tau} exp\left[\frac{\tau}{2} (Y^T Y  Y^T X \Omega X^T Y) \right]} We finally recognize a Gamma distribution, allowing us to write the posterior as: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \tau  Y, X \sim \Gamma \left( \frac{N+\kappa}{2}, \; \frac{1}{2} (Y^T Y  Y^T X \Omega X^T Y + \lambda) \right)}
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle B, \tau  Y, X \sim \mathcal{N}IG(\Omega X^TY, \; \tau^{1}\Omega, \; \frac{N+\kappa}{2}, \; \frac{\lambda^\ast}{2})} where Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \lambda^\ast = Y^T Y  Y^T X \Omega X^T Y + \lambda}
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(B  Y, X) = \int \mathsf{P}(\tau) \mathsf{P}(B  Y, X, \tau) \mathsf{d}\tau} Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(B  Y, X) = \frac{\frac{\lambda^\ast}{2}^{\frac{N+\kappa}{2}}}{(2\pi)^\frac{3}{2} \Omega^{\frac{1}{2}} \Gamma(\frac{N+\kappa}{2})} \int \tau^{\frac{N+\kappa+3}{2}1} exp \left[\tau \left( \frac{\lambda^\ast}{2} + (B  \Omega X^TY)^T \Omega^{1} (B  \Omega X^TY) \right) \right] \mathsf{d}\tau} Here we recognize the formula to integrate the Gamma function: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(B  Y, X) = \frac{\frac{\lambda^\ast}{2}^{\frac{N+\kappa}{2}} \Gamma(\frac{N+\kappa+3}{2})}{(2\pi)^\frac{3}{2} \Omega^{\frac{1}{2}} \Gamma(\frac{N+\kappa}{2})} \left( \frac{\lambda^\ast}{2} + (B  \Omega X^TY)^T \Omega^{1} (B  \Omega X^TY) \right)^{\frac{N+\kappa+3}{2}}} And we now recognize a multivariate Student's tdistribution: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(B  Y, X) = \frac{\Gamma(\frac{N+\kappa+3}{2})}{\Gamma(\frac{N+\kappa}{2}) \pi^\frac{3}{2} \lambda^\ast \Omega^{\frac{1}{2}} } \left( 1 + \frac{(B  \Omega X^TY)^T \Omega^{1} (B  \Omega X^TY)}{\lambda^\ast} \right)^{\frac{N+\kappa+3}{2}}} We hence can write: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle B  Y, X \sim \mathcal{S}_{N+\kappa}(\Omega X^TY, \; (Y^T Y  Y^T X \Omega X^T Y + \lambda) \Omega)}
We want to test the following null hypothesis: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle H_0: \; a = d = 0} In Bayesian modeling, hypothesis testing is performed with a Bayes factor, which in our case can be written as: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathrm{BF} = \frac{\mathsf{P}(Y  X, a \neq 0, d \neq 0)}{\mathsf{P}(Y  X, a = 0, d = 0)}} We can shorten this into: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathrm{BF} = \frac{\mathsf{P}(Y  X)}{\mathsf{P}_0(Y)}} Note that, compare to frequentist hypothesis testing which focuses on the null, the Bayes factor requires to explicitly model the data under the alternative. This makes a big difference when interpreting the results (see below). Let's start with the numerator: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(Y  X) = \int \mathsf{P}(\tau) \mathsf{P}(Y  X, \tau) \mathsf{d}\tau} First, let's calculate what is inside the integral: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(Y  X, \tau) = \frac{\mathsf{P}(B  \tau) \mathsf{P}(Y  X, \tau, B)}{\mathsf{P}(B  Y, X, \tau)}} Using the formula obtained previously and doing some algebra gives: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(Y  X, \tau) = \left( \frac{\tau}{2 \pi} \right)^{\frac{N}{2}} \left( \frac{\Omega}{\Sigma_B} \right)^{\frac{1}{2}} exp\left( \frac{\tau}{2} (Y^TY  Y^TX\Omega X^TY) \right)} Now we can integrate out Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \tau} (note the small typo in equation 9 of supplementary text S1 of Servin & Stephens): Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(Y  X) = (2\pi)^{\frac{N}{2}} \left( \frac{\Omega}{\Sigma_B} \right)^{\frac{1}{2}} \frac{\frac{\lambda}{2}^{\frac{\kappa}{2}}}{\Gamma(\frac{\kappa}{2})} \int \tau^{\frac{N+\kappa}{2}1} exp \left( \frac{\tau}{2} (Y^TY  Y^TX\Omega X^TY + \lambda) \right) \mathsf{d}\tau} Inside the integral, we recognize the almostcomplete pdf of a Gamma distribution. As it has to integrate to one, we get: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(Y  X) = (2\pi)^{\frac{N}{2}} \left( \frac{\Omega}{\Sigma_B} \right)^{\frac{1}{2}} \left( \frac{\lambda}{2} \right)^{\frac{\kappa}{2}} \frac{\Gamma(\frac{N+\kappa}{2})}{\Gamma(\frac{\kappa}{2})} \left( \frac{Y^TY  Y^TX\Omega X^TY + \lambda}{2} \right)^{\frac{N+\kappa}{2}}} We can use this expression also under the null. In this case, as we need neither Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle a} nor Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle d} , Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle B} is simply Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mu} , Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \Sigma_B} is Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \sigma_{\mu}^2} and Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle X} is a vector of 1's. We can also defines Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \Omega_0 = ((\sigma_{\mu}^2)^{1} + N)^{1}} . In the end, this gives: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}_0(Y) = (2\pi)^{\frac{N}{2}} \frac{\Omega_0^{\frac{1}{2}}}{\sigma_{\mu}} \left( \frac{\lambda}{2} \right)^{\frac{\kappa}{2}} \frac{\Gamma(\frac{N+\kappa}{2})}{\Gamma(\frac{\kappa}{2})} \left( \frac{Y^TY  \Omega_0 N^2 \bar{Y}^2 + \lambda}{2} \right)^{\frac{N+\kappa}{2}}} We can therefore write the Bayes factor: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathrm{BF} = \left( \frac{\Omega}{\Omega_0} \right)^{\frac{1}{2}} \frac{1}{\sigma_a \sigma_d} \left( \frac{Y^TY  Y^TX\Omega X^TY + \lambda}{Y^TY  \Omega_0 N^2 \bar{Y}^2 + \lambda} \right)^{\frac{N+\kappa}{2}}} When the Bayes factor is large, we say that there is enough evidence in the data to support the alternative. Indeed, the Bayesian testing procedure corresponds to measuring support for the specific alternative hypothesis compared to the null hypothesis. Importantly, note that, for a frequentist testing procedure, we would say that there is enough evidence in the data to reject the null. However we wouldn't say anything about the alternative as we don't model it. The threshold to say that a Bayes factor is large depends on the field. It is possible to use the Bayes factor as a test statistic when doing permutation testing, and then control the false discovery rate. This can give an idea of a reasonable threshold.
Such a question is never easy to answer. But note that all hyperparameters are not that important, especially in typical quantitative genetics applications. For instance, we are mostly interested in those that determine the magnitude of the effects, Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \sigma_a} and Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \sigma_d} , so let's deal with the others first. As explained in Servin & Stephens, the posteriors for Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \tau} and Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle B} change appropriately with shifts (Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle y+c} ) and scaling (Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle y \times c} ) in the phenotype when taking their limits. This also gives us a new Bayes factor, the one used in practice (see Guan & Stephens, 2008): Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathrm{lim}_{\sigma_{\mu} \rightarrow \infty \; ; \; \lambda \rightarrow 0 \; ; \; \kappa \rightarrow 0 } \; \mathrm{BF} = \left( \frac{N}{\Sigma_B^{1} + X^TX} \right)^{\frac{1}{2}} \frac{1}{\sigma_a \sigma_d} \left( \frac{Y^TY  Y^TX (\Sigma_B^{1} + X^TX)^{1} X^TY}{Y^TY  N \bar{Y}^2} \right)^{\frac{N}{2}}} Now, for the important hyperparameters, Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \sigma_a} and Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \sigma_d} , it is usual to specify a grid of values, i.e. Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle M} pairs Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle (\sigma_a, \sigma_d)} . For instance, Guan & Stephens used the following grid: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle M=4 \; ; \; \sigma_a \in \{0.05, 0.1, 0.2, 0.4\} \; ; \; \sigma_d = \frac{\sigma_a}{4}} Then, we can average the Bayes factors obtained over the grid using, as a first approximation, equal weights: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathrm{BF} = \sum_{m \, \in \, \text{grid}} \frac{1}{M} \, \mathrm{BF}(\sigma_a^{(m)}, \sigma_d^{(m)})} In eQTL studies, the weights can be estimated from the data using a hierarchical model (see below), by pooling all genes together as in Veyrieras et al (PLoS Genetics, 2010).
BF < function(G=NULL, Y=NULL, sigma.a=NULL, sigma.d=NULL, get.log10=TRUE){ stopifnot(! is.null(G), ! is.null(Y), ! is.null(sigma.a), ! is.null(sigma.d)) subset < complete.cases(Y) & complete.cases(G) Y < Y[subset] G < G[subset] stopifnot(length(Y) == length(G)) N < length(G) X < cbind(rep(1,N), G, G == 1) inv.Sigma.B < diag(c(0, 1/sigma.a^2, 1/sigma.d^2)) inv.Omega < inv.Sigma.B + t(X) %*% X inv.Omega0 < N tY.Y < t(Y) %*% Y log10.BF < as.numeric(0.5 * log10(inv.Omega0)  0.5 * log10(det(inv.Omega))  log10(sigma.a)  log10(sigma.d)  (N/2) * (log10(tY.Y  t(Y) %*% X %*% solve(inv.Omega) %*% t(X) %*% cbind(Y))  log10(tY.Y  N*mean(Y)^2))) if(get.log10) return(log10.BF) else return(10^log10.BF) } In the same vein as what is explained here, we can simulate data under different scenarios and check the BFs: N < 300 # play with it PVE < 0.1 # play with it grid < c(0.05, 0.1, 0.2, 0.4, 0.8, 1.6, 3.2) MAF < 0.3 G < rbinom(n=N, size=2, prob=MAF) tau < 1 a < sqrt((2/5) * (PVE / (tau * MAF * (1MAF) * (1PVE)))) d < a / 2 mu < rnorm(n=1, mean=0, sd=10) Y < mu + a * G + d * (G == 1) + rnorm(n=N, mean=0, sd=tau) for(m in 1:length(grid)) print(BF(G, Y, grid[m], grid[m]/4))
There are many equivalent ways to write the likelihood, the usual one being: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle y_i  p_i \; \overset{i.i.d}{\sim} \; \mathrm{Binomial}(1,p_i)} with the logodds (logit function) being Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathrm{ln} \frac{p_i}{1  p_i} = \mu + a \, g_i + d \, \mathbf{1}_{g_i=1}} Let's use Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle X_i^T=(1 \; g_i \; \mathbf{1}_{g_i=1})} to denote the Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle i} th row of the design matrix Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle X} . We can also keep the same definition as above for Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle B=(\mu \; a \; d)^T} . Thus we have: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle p_i = \frac{e^{X_i^TB}}{1 + e^{X_i^TB}}} As the Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle y_i} 's can only take Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle 0} and Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle 1} as values, the likelihood can be written as: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathcal{L}(B) = \mathsf{P}(Y  X, B) = \prod_{i=1}^N p_i^{y_i} (1p_i)^{1y_i}} We still use the same prior as above for Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle B} (but there is no Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \tau} anymore), so that: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle B  \Sigma_B \sim \mathcal{N}_3(0, \Sigma_B)} where Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \Sigma_B} is a 3 x 3 matrix with Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle (\sigma_\mu^2 \; \sigma_a^2 \; \sigma_d^2)} on the diagonal and 0 elsewhere. As above, the Bayes factor is used to compare the two models: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathrm{BF} = \frac{\mathsf{P}(Y  X, M1)}{\mathsf{P}(Y  X, M0)} = \frac{\mathsf{P}(Y  X, a \neq 0, d \neq 0)}{\mathsf{P}(Y  X, a=0, d=0)} = \frac{\int \mathsf{P}(B) \mathsf{P}(Y  X, B) \mathrm{d}B}{\int \mathsf{P}(\mu) \mathsf{P}(Y  X, \mu) \mathrm{d}\mu}} The interesting point here is that there is no way to analytically calculate these integrals (marginal likelihoods). Therefore, we will use Laplace's method to approximate them, as in Guan & Stephens (2008). Starting with the numerator: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(YX,M1) = \int \exp \left[ N \left( \frac{1}{N} \mathrm{ln} \, \mathsf{P}(B) + \frac{1}{N} \mathrm{ln} \, \mathsf{P}(Y  X, B) \right) \right] \mathsf{d}B} Let's use Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle f} to denote the function inside the exponential: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(YX,M1) = \int \exp \left( N \; f(B) \right) \mathsf{d}B} The function Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle f} is defined by: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle f: \mathbb{R}^3 \rightarrow \mathbb{R}} Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle f(B) = \frac{1}{N} \left( \frac{3}{2} \mathrm{ln}(2 \pi)  \frac{1}{2} \mathrm{ln}(\Sigma_B)  \frac{1}{2}(B^T \Sigma_B^{1} B) \right) + \frac{1}{N} \sum_{i=1}^N \left( y_i \mathrm{ln}(p_i) + (1y_i) \mathrm{ln}(1  p_i) \right)} This function will then be used to approximate the integral, like this: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathsf{P}(YX,M1) \approx N^{3/2} (2 \pi)^{3/2} H(B^\star)^{1/2} e^{N f(B^\star)}} where Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle H} is the Hessian of Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle f} and Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle B^\star = (\mu^\star a^\star d^\star)^T} is the point at which Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle f} is maximized. We therefore need two things: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle H} and Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle B^\star} . Note that for both we need to calculate the first derivatives of Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle f} . As Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle f} is multidimensional (it takes values in Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \mathbb{R}^3} ), we need to calculate its gradient. In the following, some formulas from matrix calculus are sometimes required. In such cases, I will use the numerator layout. Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \nabla f = \frac{\partial f}{\partial B} = \left( \frac{\partial f}{\partial \mu} \; \frac{\partial f}{\partial a} \; \frac{\partial f}{\partial d} \right)} Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \nabla f =  \frac{1}{2N} \frac{\partial B^T\Sigma_B^{1}B}{\partial B} + \frac{1}{N} \sum_i \left( y_i \frac{\partial \mathrm{ln}(p_i)}{\partial B} + (1y_i) \frac{\partial \mathrm{ln}(1p_i)}{\partial B} \right)} Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \nabla f =  \frac{1}{N} B^T\Sigma_B^{1} + \frac{1}{N} \sum_i \left( \frac{y_i}{p_i}  \frac{1y_i}{1p_i} \right) \frac{\partial p_i}{\partial B}} A simple form for the first derivatives of Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle p_i} also exists when writing Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle p_i = e^{X_i^tB} (1 + e^{X_i^tB})^{1}} : Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \frac{\partial p_i}{\partial B} = \frac{\partial e^{X_i^TB}}{\partial B} (1 + e^{X_i^TB})^{1} + e^{X_i^TB} \frac{\partial (1 + e^{X_i^TB})^{1}}{\partial B}} Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \frac{\partial p_i}{\partial B} = e^{X_i^TB} \frac{\partial X_i^TB}{\partial B} (1 + e^{X_i^TB})^{1}  e^{X_i^TB} (1 + e^{X_i^TB})^{2} \frac{\partial (1 + e^{X_i^TB})}{\partial B}} Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \frac{\partial p_i}{\partial B} = p_i X_i^T  p_i (1 + e^{X_i^TB})^{1} e^{X_i^TB} \frac{\partial X_i^TB}{\partial B}} Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \frac{\partial p_i}{\partial B} = p_i (1  p_i) X_i^T} This simplifies the gradient of Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle f} into: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \nabla f =  \frac{1}{N} B^T\Sigma_B^{1} + \frac{1}{N} \sum_i (y_i  p_i) X_i^T} To find Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle B^\star} , we set Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \nabla f(B^\star) = 0} . However, in this equation, Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle B^\star} is present not only alone but also in the Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle p_i} 's. As Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle p_i} is a nonlinear function of Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle B} , the equation can't be solved directly but an iterative procedure is required, typically a conjugate gradient method (as in Guan & Stephens) or Newton's method. The former only requires Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle f} and Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \nabla f} while the latter also requires Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle H} . Remember that, in any case, we need Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle H} for the Laplace approximation, so let's calculate it: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle H =  \frac{1}{N} \Sigma_B^{1}  \frac{1}{N} \sum_i \frac{\partial p_i}{\partial B} X_i^T} Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle H =  \frac{1}{N} \Sigma_B^{1}  \frac{1}{N} \sum_i X_i^T p_i (1p_i) X_i} Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle H =  \frac{1}{N} (\Sigma_B^{1} + X^T W X)} where Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle W} is the N x N matrix with Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle p_i(1p_i)} on the diagonal. Note that all second derivatives of Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle f} are strictly negative. Therefore, Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle f} is globally convex, which means that it has a unique global maximum, at Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle B^\star} . As a consequence, we have the right to use Laplace's method to approximate the integral of Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle f} around its maximum. implementation in R > to do finding the effect sizes and their std error: to do
to do
to do
to do
For instance, samples from individual 1 and 2 are correlated with each other because they were treated another day than all the other samples. Such a correlations has nothing to do with the genotype at a given SNP (in most cases). However, the core model, Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle y_i = \mu + \beta g_i + \epsilon_i} assumes that the errors are uncorrelated between individuals: Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \epsilon_i \overset{\mathrm{i.i.d}}{\sim} \mathcal{N}(0,\tau^{1})} . If this is not the case, i.e. if the Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle y_i} 's are correlated but this correlation has nothing to do with the Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle g_i} 's, then more variance in the errors won't be accounted for, and we'll loose power when trying to detect weak, yet nonzero Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle \beta} . An intuitive way of removing these confounders is to realize that we can use all gene expression levels to try to identify them. Indeed, batch effects are very likely to affect all genes in a sample (though possibly at different magnitudes). As the effect of the confounders are, as a first approximation, typically much bigger than the effect of a SNP genotype, we can try to learn the confounders using all gene expression levels, and only them. So let's put all of them into a Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle G \times N} matrix Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://api.formulasearchengine.com/v1/":): {\displaystyle Y_1} with genes in rows and individuals in columns. For the moment, the data are expressed in the standard basis, i.e. the basis of the observations. But some confounders are present in these data, they contribute with noise and redundancy and hence dilute the signal. The idea is, first, to identify a new basis which will correspond to a "mix" of the original samples (e.g. one component of this "mix" may correspond to the day at which the samples were processed), and second, to remove these components from the data in order to only keep the signal. to be continued see also factor analysis, see Stegle et al (PLoS Computational Biology, 2010)
to do
to do
to do
to do
