User:Timothee Flutre/Notebook/Postdoc/2012/01/02: Difference between revisions

From OpenWetWare
(Autocreate 2012/01/02 Entry for User:Timothee_Flutre/Notebook/Postdoc)
 
(→‎Entry title: start describing multivar normal distrib)
| colspan="2"|
<!-- ##### DO NOT edit above this line unless you know what you are doing. ##### -->
==About the multivariate Normal distribution==


* '''Motivation''': when we measure things, we often have to measure several properties for each item. For instance, for each person, we measure the expression levels of all genes in their sample.
* '''Data''': we have N observations, denoted <math>X = (x_1, x_2, \ldots, x_N)</math>, each of dimension <math>P</math>. This means that each <math>x_i</math> is a vector in <math>\mathbb{R}^P</math>.
* '''Model''': we suppose that the <math>x_i</math> are independent and identically distributed according to a [http://en.wikipedia.org/wiki/Multivariate_normal_distribution multivariate Normal distribution] <math>N_P(\mu, \Sigma)</math>. <math>\mu</math> is the P-dimensional mean vector, and <math>\Sigma</math> the <math>P \times P</math> covariance matrix. If <math>\Sigma</math> is [http://en.wikipedia.org/wiki/Positive-definite_matrix positive definite] (which we will assume), the density function for a given x is: <math>f(x/\mu,\Sigma) = (2 \pi)^{-P/2} |\Sigma|^{-1/2} \exp\left(-\frac{1}{2} (x-\mu)^T \Sigma^{-1} (x-\mu)\right)</math>, with <math>|M|</math> denoting the determinant of a matrix <math>M</math> and <math>M^T</math> its transpose.
* '''Likelihood''': as usual, we will start by writing down the likelihood of the data, the parameters being <math>\theta=(\mu,\Sigma)</math>:
<math>L(\theta) = \mathbb{P}(X/\theta)</math>
As the observations are independent:
<math>L(\theta) = \prod_{i=1}^N f(x_i / \theta)</math>
It is easier to work with the log-likelihood:
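<math>l(\theta) = \ln(L(\theta)) = \sum_{i=1}^N \ln( f(x_i / \theta) )</math>
Each term is the log of the density given in the model, <math>\ln( f(x_i / \theta) ) = -\frac{P}{2} \ln(2\pi) - \frac{1}{2} \ln(|\Sigma|) - \frac{1}{2} (x_i-\mu)^T \Sigma^{-1} (x_i-\mu)</math>, so summing over the N observations gives:
<math>l(\theta) = -\frac{NP}{2} \ln(2\pi) - \frac{N}{2} \ln(|\Sigma|) - \frac{1}{2} \sum_{i=1}^N (x_i-\mu)^T \Sigma^{-1} (x_i-\mu)</math>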


<!-- ##### DO NOT edit below this line unless you know what you are doing. ##### -->

Revision as of 09:35, 2 January 2012


About the multivariate Normal distribution

  • Motivation: when we measure things, we often have to measure several properties for each item. For instance, for each person, we measure the expression levels of all genes in their sample.
  • Data: we have N observations, denoted [math]\displaystyle{ X = (x_1, x_2, \ldots, x_N) }[/math], each of dimension [math]\displaystyle{ P }[/math]. This means that each [math]\displaystyle{ x_i }[/math] is a vector in [math]\displaystyle{ \mathbb{R}^P }[/math].
  • Model: we suppose that the [math]\displaystyle{ x_i }[/math] are independent and identically distributed according to a multivariate Normal distribution [math]\displaystyle{ N_P(\mu, \Sigma) }[/math]. [math]\displaystyle{ \mu }[/math] is the P-dimensional mean vector, and [math]\displaystyle{ \Sigma }[/math] the [math]\displaystyle{ P \times P }[/math] covariance matrix. If [math]\displaystyle{ \Sigma }[/math] is positive definite (which we will assume), the density function for a given x is: [math]\displaystyle{ f(x/\mu,\Sigma) = (2 \pi)^{-P/2} |\Sigma|^{-1/2} \exp\left(-\frac{1}{2} (x-\mu)^T \Sigma^{-1} (x-\mu)\right) }[/math], with [math]\displaystyle{ |M| }[/math] denoting the determinant of a matrix [math]\displaystyle{ M }[/math] and [math]\displaystyle{ M^T }[/math] its transpose.
  • Likelihood: as usual, we will start by writing down the likelihood of the data, the parameters being [math]\displaystyle{ \theta=(\mu,\Sigma) }[/math]:

[math]\displaystyle{ L(\theta) = \mathbb{P}(X/\theta) }[/math]

As the observations are independent:

[math]\displaystyle{ L(\theta) = \prod_{i=1}^N f(x_i / \theta) }[/math]

It is easier to work with the log-likelihood:

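[math]\displaystyle{ l(\theta) = \ln(L(\theta)) = \sum_{i=1}^N \ln( f(x_i / \theta) ) }[/math]

Each term is the log of the density given in the model, [math]\displaystyle{ \ln( f(x_i / \theta) ) = -\frac{P}{2} \ln(2\pi) - \frac{1}{2} \ln(|\Sigma|) - \frac{1}{2} (x_i-\mu)^T \Sigma^{-1} (x_i-\mu) }[/math], so summing over the N observations gives:

[math]\displaystyle{ l(\theta) = -\frac{NP}{2} \ln(2\pi) - \frac{N}{2} \ln(|\Sigma|) - \frac{1}{2} \sum_{i=1}^N (x_i-\mu)^T \Sigma^{-1} (x_i-\mu) }[/math]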
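
To check the log-likelihood formula numerically, here is a minimal sketch in Python that uses only the standard library. The function names (cholesky, log_likelihood) and the example data are assumptions made for illustration and are not part of the notebook. The sketch also relies on a Cholesky factorization Sigma = L L^T, a choice made here for convenience: the log-determinant is then twice the sum of the logs of the diagonal of L, and each quadratic form is the squared norm of the vector z solving L z = x_i - mu.

```python
import math


def cholesky(sigma):
    """Return lower-triangular L with L L^T = sigma.

    Raises ValueError if sigma is not positive definite,
    which is the assumption made in the model.
    """
    p = len(sigma)
    low = [[0.0] * p for _ in range(p)]
    for j in range(p):
        for i in range(j, p):
            s = sigma[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("sigma is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


def log_likelihood(xs, mu, sigma):
    """Log-likelihood l(theta) of N i.i.d. observations under N_P(mu, sigma).

    xs is a list of N vectors of length P, mu a vector of length P,
    sigma a P x P covariance matrix given as a list of rows.
    """
    n, p = len(xs), len(mu)
    low = cholesky(sigma)
    # ln|Sigma| = 2 * sum_j ln(L_jj)
    log_det = 2.0 * sum(math.log(low[j][j]) for j in range(p))
    quad = 0.0
    for x in xs:
        # Forward substitution: solve L z = (x - mu),
        # so that z.z = (x - mu)^T Sigma^{-1} (x - mu).
        z = []
        for i in range(p):
            r = x[i] - mu[i] - sum(low[i][k] * z[k] for k in range(i))
            z.append(r / low[i][i])
        quad += sum(v * v for v in z)
    return (-0.5 * n * p * math.log(2.0 * math.pi)
            - 0.5 * n * log_det
            - 0.5 * quad)


# Hypothetical example: N = 3 observations of dimension P = 2.
observations = [[1.0, 0.0], [0.5, -0.2], [-0.3, 0.8]]
mean = [0.0, 0.0]
covariance = [[2.0, 0.5], [0.5, 1.0]]
print(log_likelihood(observations, mean, covariance))
```

In practice one would probably rely on NumPy and SciPy (for instance scipy.stats.multivariate_normal.logpdf) rather than hand-written linear algebra; the point of the sketch is only to mirror the three terms of the formula.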