Bayesian model of univariate linear regression for QTL detection
See Servin & Stephens (PLoS Genetics, 2007).
- Data: let's assume that we obtained data from N individuals. We denote by $y_1, \ldots, y_N$ the (quantitative) phenotypes (e.g. expression level at a given gene), and by $g_1, \ldots, g_N$ the genotypes at a given SNP (as allele dose: 0, 1 or 2).
- Goal: we want to assess the evidence in the data for an effect of the genotype on the phenotype.
- Assumptions: the relationship between genotype and phenotype is linear; the individuals are not genetically related; there are no hidden confounding factors in the phenotypes.
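- Likelihood: for all $i \in \{1, \ldots, N\}$,
$$y_i = \mu + \beta_1 g_i + \beta_2 \mathbf{1}_{g_i=1} + \epsilon_i \text{ with } \epsilon_i \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, \tau^{-1})$$
where $\beta_1$ is in fact the additive effect of the SNP, noted $a$ from now on, and $\beta_2$ is the dominance effect of the SNP, noted $d$.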
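Let's now write in matrix notation, with $Y$ the vector of the N phenotypes, $X$ the $N \times 3$ design matrix whose $i$-th row is $(1, \; g_i, \; \mathbf{1}_{g_i=1})$, and $E$ the vector of errors:
$$Y = XB + E \text{ where } B = [\mu \; a \; d]^T$$
which gives the following conditional distribution for the phenotypes:
$$Y | X, B, \tau \sim \mathcal{N}(XB, \; \tau^{-1} I_N)$$
The likelihood of the parameters given the data is therefore:
$$\mathcal{L}(\tau, B) = P(Y | X, \tau, B) = \left(\frac{\tau}{2\pi}\right)^{N/2} \exp\left(-\frac{\tau}{2}(Y - XB)^T(Y - XB)\right)$$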
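As a rough illustration of the model, the following sketch simulates genotypes and phenotypes and evaluates the log-likelihood. It assumes that the design matrix has columns (1, g, indicator of g = 1) and that the errors are Normal with precision tau. The function names, the seed, the allele frequency and the parameter values are all hypothetical choices made for the example.

```python
import numpy as np

def design_matrix(genotypes):
    """Build the N x 3 design matrix [1, g, 1_{g=1}] matching B = (mu, a, d)."""
    g = np.asarray(genotypes, dtype=float)
    return np.column_stack([np.ones_like(g), g, (g == 1).astype(float)])

def log_likelihood(y, X, B, tau):
    """Log-density of N(y | XB, tau^{-1} I_N)."""
    resid = y - X @ B
    n = y.shape[0]
    return 0.5 * n * np.log(tau / (2.0 * np.pi)) - 0.5 * tau * (resid @ resid)

# Hypothetical simulation settings (not from the source)
rng = np.random.default_rng(1859)
N = 200
g = rng.binomial(2, 0.3, size=N)      # allele dose, assumed allele frequency 0.3
X = design_matrix(g)
B_true = np.array([4.0, 1.0, 0.2])    # assumed (mu, a, d)
tau_true = 2.0                        # assumed error precision
y = X @ B_true + rng.normal(0.0, tau_true ** -0.5, size=N)

ll_true = log_likelihood(y, X, B_true, tau_true)
ll_null = log_likelihood(y, X, np.zeros(3), tau_true)
```

With these settings the log-likelihood at the simulating parameters should be far higher than at B = 0, which gives a quick sanity check of the implementation.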
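- Priors: we use the usual conjugate prior, $P(\tau, B) = P(\tau) P(B | \tau)$, with the Gamma distribution parameterized by shape and rate:
$$\tau \sim \Gamma(\kappa/2, \; \lambda/2)$$
$$B | \tau \sim \mathcal{N}(\vec{0}, \; \tau^{-1} \Sigma_B) \text{ with } \Sigma_B = \mathrm{diag}(\sigma_\mu^2, \sigma_a^2, \sigma_d^2)$$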
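To get a feel for what such a conjugate prior implies, one can draw from it. The sketch below assumes a Gamma prior on the precision tau with shape kappa/2 and rate lambda/2, and a Normal prior on B given tau with a diagonal covariance scaled by 1/tau. The function name `sample_prior` and the hyperparameter values are hypothetical.

```python
import numpy as np

def sample_prior(rng, kappa, lam, prior_var):
    """Draw (tau, B) from the assumed Normal-Gamma prior.

    prior_var holds the diagonal of Sigma_B, in the order (mu, a, d).
    """
    # NumPy's gamma takes a scale, i.e. 1 / rate
    tau = rng.gamma(shape=kappa / 2.0, scale=2.0 / lam)
    # B | tau ~ N(0, Sigma_B / tau); independent components since Sigma_B is diagonal
    sd = np.sqrt(np.asarray(prior_var, dtype=float) / tau)
    B = rng.normal(0.0, sd)
    return tau, B

rng = np.random.default_rng(1859)  # arbitrary seed
tau, B = sample_prior(rng, kappa=4.0, lam=2.0, prior_var=[10.0, 1.0, 1.0])
```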
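- Conditional posterior of $B$:
Here and in the following, we neglect all constants (e.g. the normalization constant, $Y^TY$, etc.):
$$P(B | Y, X, \tau) \propto P(B | \tau) \, P(Y | X, \tau, B)$$
We use the prior and likelihood and keep only the terms in $B$:
$$\propto \exp\left(-\frac{\tau}{2} B^T \Sigma_B^{-1} B\right) \exp\left(-\frac{\tau}{2} (Y - XB)^T (Y - XB)\right)$$
We factorize some terms:
$$\propto \exp\left[-\frac{\tau}{2} \left( B^T (\Sigma_B^{-1} + X^TX) B - Y^TXB - B^TX^TY \right)\right]$$
Let's define $\Omega = (\Sigma_B^{-1} + X^TX)^{-1}$. We can see that $\Omega^T = \Omega$, which means that $\Omega$ is a symmetric matrix.
This is particularly useful here because we can use the following equality: $\Omega^T \Omega^{-1} = \Omega^{-1} \Omega = I$.
$$\propto \exp\left[-\frac{\tau}{2} \left( B^T \Omega^{-1} B - (X^TY)^T \Omega^T \Omega^{-1} B - B^T \Omega^{-1} \Omega X^TY \right)\right]$$
It now becomes easy to complete the square, up to the term $Y^TX \Omega X^TY$, which does not depend on $B$:
$$\propto \exp\left[-\frac{\tau}{2} (B - \Omega X^TY)^T \Omega^{-1} (B - \Omega X^TY)\right]$$
We recognize the kernel of a Normal distribution, allowing us to write the conditional posterior as:
$$B | Y, X, \tau \sim \mathcal{N}(\Omega X^TY, \; \tau^{-1} \Omega)$$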
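The conditional posterior can be computed in a few lines. This is a minimal sketch, assuming that the posterior is Normal with mean Omega X^T Y and covariance Omega / tau, where Omega is the inverse of (Sigma_B^{-1} + X^T X) and Sigma_B is diagonal. The function name and the simulated data are hypothetical. With a very diffuse prior, the posterior mean should be close to the ordinary least-squares estimate.

```python
import numpy as np

def conditional_posterior_B(y, X, tau, prior_var):
    """Mean and covariance of B | Y, X, tau under the assumed conjugate prior."""
    Sigma_B_inv = np.diag(1.0 / np.asarray(prior_var, dtype=float))
    Omega = np.linalg.inv(Sigma_B_inv + X.T @ X)
    mean = Omega @ (X.T @ y)
    cov = Omega / tau
    return mean, cov

# Hypothetical simulated data: columns of X are (1, g, 1_{g=1})
rng = np.random.default_rng(1859)
N = 200
g = rng.binomial(2, 0.3, size=N)
X = np.column_stack([np.ones(N), g, (g == 1).astype(float)])
y = X @ np.array([4.0, 1.0, 0.2]) + rng.normal(0.0, 2.0 ** -0.5, size=N)

# Diffuse prior variances, so the prior barely shrinks the estimates
post_mean, post_cov = conditional_posterior_B(y, X, tau=2.0, prior_var=[1e6, 1e6, 1e6])

# Ordinary least squares, for comparison
ols, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Smaller prior variances would shrink the posterior mean of the corresponding effects towards zero.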
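- Posterior of $\tau$:
Similarly to the equations above:
$$P(\tau | Y, X) \propto P(\tau) \, P(Y | X, \tau)$$
But now, to handle the second term, we need to integrate over $B$, thus effectively taking into account the uncertainty in $B$:
$$\propto P(\tau) \int P(B | \tau) \, P(Y | X, \tau, B) \, \mathrm{d}B$$
Again, we use the priors and likelihoods specified above (but everything inside the integral is kept inside it, even if it doesn't depend on $B$!):
$$\propto \tau^{\frac{\kappa}{2} - 1} e^{-\frac{\lambda}{2} \tau} \int \tau^{3/2} \, \tau^{N/2} \exp\left(-\frac{\tau}{2} B^T \Sigma_B^{-1} B\right) \exp\left(-\frac{\tau}{2} (Y - XB)^T (Y - XB)\right) \mathrm{d}B$$
As we used a conjugate prior for $\tau$, we know that we expect a Gamma distribution for the posterior.
Therefore, we can take $\tau^{N/2}$ out of the integral and start guessing what looks like a Gamma distribution.
We also factorize inside the exponential, where $\Omega = (\Sigma_B^{-1} + X^TX)^{-1}$:
$$\propto \tau^{\frac{N + \kappa}{2} - 1} e^{-\frac{\lambda}{2} \tau} \int \tau^{3/2} \exp\left[-\frac{\tau}{2} \left( (B - \Omega X^TY)^T \Omega^{-1} (B - \Omega X^TY) - Y^TX \Omega X^TY + Y^TY \right)\right] \mathrm{d}B$$
We recognize the conditional posterior of $B$.
This allows us to use the fact that the pdf of the Normal distribution integrates to one:
$$\propto \tau^{\frac{N + \kappa}{2} - 1} e^{-\frac{\lambda}{2} \tau} \exp\left[-\frac{\tau}{2} \left( Y^TY - Y^TX \Omega X^TY \right)\right]$$
We finally recognize the following Gamma distribution:
$$\tau | Y, X \sim \Gamma\left( \frac{N + \kappa}{2}, \; \frac{1}{2} \left( Y^TY - Y^TX \Omega X^TY + \lambda \right) \right)$$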
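The marginal posterior of the precision can be sketched as follows, assuming a Gamma posterior with shape (N + kappa)/2 and rate (Y^T Y - Y^T X Omega X^T Y + lambda)/2, where Omega is the inverse of (Sigma_B^{-1} + X^T X). The function name, the hyperparameter values and the simulated data are hypothetical.

```python
import numpy as np

def posterior_tau(y, X, kappa, lam, prior_var):
    """Shape and rate of the assumed Gamma posterior of tau given Y and X."""
    Sigma_B_inv = np.diag(1.0 / np.asarray(prior_var, dtype=float))
    Omega = np.linalg.inv(Sigma_B_inv + X.T @ X)
    Xty = X.T @ y
    shape = (y.shape[0] + kappa) / 2.0
    rate = 0.5 * (y @ y - Xty @ Omega @ Xty + lam)
    return shape, rate

# Hypothetical simulated data with true precision tau = 2
rng = np.random.default_rng(1859)
N = 200
g = rng.binomial(2, 0.3, size=N)
X = np.column_stack([np.ones(N), g, (g == 1).astype(float)])
y = X @ np.array([4.0, 1.0, 0.2]) + rng.normal(0.0, 2.0 ** -0.5, size=N)

shape, rate = posterior_tau(y, X, kappa=0.01, lam=0.01, prior_var=[10.0, 10.0, 10.0])
post_mean_tau = shape / rate                             # mean of a Gamma(shape, rate)
draws = rng.gamma(shape, scale=1.0 / rate, size=1000)    # NumPy uses scale = 1 / rate
```

The quantity Y^T Y - Y^T X Omega X^T Y is a penalized residual sum of squares, so it is never smaller than the ordinary least-squares residual sum of squares, and the rate stays positive.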