User:Timothee Flutre/Notebook/Postdoc/2012/08/16

Revision as of 13:49, 16 August 2012


Variational Bayes approach for the mixture of Normals

  • Motivation: I have described on another page (http://openwetware.org/wiki/User:Timothee_Flutre/Notebook/Postdoc/2011/12/14) the basics of mixture models and the EM algorithm in a frequentist context. It is worth reading before continuing. Here I am interested in the Bayesian approach as well as in a specific variational method (nicknamed "Variational Bayes").


  • Data: we have N univariate observations, <math>y_1, \ldots, y_N</math>, gathered into the vector <math>\mathbf{y}</math>.


  • Assumptions: we assume the observations to be exchangeable and distributed according to a mixture of K Normal distributions. The parameters of this model are the mixture weights (<math>w_k</math>), the means (<math>\mu_k</math>) and the precisions (<math>\tau_k</math>, i.e. the inverse variances) of the mixture components, all gathered into <math>\Theta = \{w_1,\ldots,w_K,\mu_1,\ldots,\mu_K,\tau_1,\ldots,\tau_K\}</math>. There are two constraints: <math>\sum_{k=1}^K w_k = 1</math> and <math>\forall k \; w_k > 0</math>.
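
To make these assumptions concrete, here is a minimal sketch of how data could be simulated from such a mixture when the parameters are known. The function name `simulate_mixture` and the parameter values are hypothetical, chosen only for illustration; the only point taken from the text is that each component is a Normal parametrized by its precision, so the standard deviation is 1/sqrt(τ_k).

```python
import math
import random

def simulate_mixture(n_obs, weights, means, precisions, seed=None):
    """Draw n_obs values from a mixture of Normals parametrized by precisions."""
    # Enforce the two constraints on the weights: positive and summing to 1.
    if abs(sum(weights) - 1.0) > 1e-9 or min(weights) <= 0:
        raise ValueError("weights must be positive and sum to 1")
    rng = random.Random(seed)
    components = list(range(len(weights)))
    y, labels = [], []
    for _ in range(n_obs):
        # Pick the component k with probability w_k ...
        k = rng.choices(components, weights=weights)[0]
        # ... then draw y_n from Normal(mu_k, tau_k), with sd = 1 / sqrt(tau_k).
        y.append(rng.gauss(means[k], 1.0 / math.sqrt(precisions[k])))
        labels.append(k)
    return y, labels

# Hypothetical parameter values (K = 2), for illustration only.
y, labels = simulate_mixture(1000, [0.3, 0.7], [-2.0, 3.0], [1.0, 4.0], seed=1)
```

In a real analysis only `y` would be observed; the component labels are kept here just to check the simulation.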


  • Observed likelihood: <math>p(\mathbf{y} | \Theta, K) = \prod_{n=1}^N p(y_n|\Theta,K) = \prod_{n=1}^N \sum_{k=1}^K w_k Normal(y_n;\mu_k,\tau_k)</math>
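
As an illustration, the observed likelihood can be evaluated on the log scale as sketched below. This is not code from the original notebook: the function names are hypothetical, and the log-sum-exp trick is an implementation choice added here to avoid numerical underflow when summing over the K components.

```python
import math

def normal_logpdf(y, mu, tau):
    """Log density of Normal(y; mu, tau), where tau is the precision (1/variance)."""
    return 0.5 * (math.log(tau) - math.log(2.0 * math.pi)) - 0.5 * tau * (y - mu) ** 2

def observed_loglik(y, weights, means, precisions):
    """log p(y | Theta, K) = sum_n log sum_k w_k Normal(y_n; mu_k, tau_k)."""
    total = 0.0
    for y_n in y:
        # Log of each term w_k * Normal(y_n; mu_k, tau_k).
        terms = [math.log(w) + normal_logpdf(y_n, mu, tau)
                 for w, mu, tau in zip(weights, means, precisions)]
        # Log-sum-exp: factor out the largest term before exponentiating.
        m = max(terms)
        total += m + math.log(sum(math.exp(t - m) for t in terms))
    return total
```

With a single component of weight 1, the function reduces to the usual Normal log-likelihood, which gives a quick sanity check.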


  • Latent variables: let's introduce N latent variables, <math>z_1,\ldots,z_N</math>, gathered into the vector <math>\mathbf{z}</math>. Each <math>z_n</math> is a vector of length K with a single 1 indicating the component to which the <math>n^{th}</math> observation belongs, and K-1 zeroes.


  • Augmented likelihood: <math>p(\mathbf{y},\mathbf{z}|\Theta,K) = \prod_{n=1}^N p(y_n,z_n|\Theta,K) = \prod_{n=1}^N p(z_n|\Theta,K) p(y_n|z_n,\Theta,K) = \prod_{n=1}^N \prod_{k=1}^K w_k^{z_{nk}} Normal(y_n;\mu_k,\tau_k)^{z_{nk}}</math>
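
The role of the one-hot vectors can be sketched in code: on the log scale, the exponents z_nk simply select, for each observation, the single term of its own component. The function names below are hypothetical and the code is only an illustration of the formula above, not an implementation from the original notebook.

```python
import math

def normal_logpdf(y, mu, tau):
    """Log density of Normal(y; mu, tau), where tau is the precision (1/variance)."""
    return 0.5 * (math.log(tau) - math.log(2.0 * math.pi)) - 0.5 * tau * (y - mu) ** 2

def augmented_loglik(y, z, weights, means, precisions):
    """log p(y, z | Theta, K) = sum_n sum_k z_nk [log w_k + log Normal(y_n; mu_k, tau_k)]."""
    total = 0.0
    for y_n, z_n in zip(y, z):
        # Each z_n must contain a single 1 and K-1 zeroes.
        if sorted(z_n) != [0] * (len(z_n) - 1) + [1]:
            raise ValueError("each z_n must be a one-hot vector")
        for z_nk, w, mu, tau in zip(z_n, weights, means, precisions):
            if z_nk == 1:
                # Only the component flagged by z_n contributes.
                total += math.log(w) + normal_logpdf(y_n, mu, tau)
    return total
```

For a single observation, summing exp(augmented_loglik) over the K possible values of z_n gives back the mixture density of the observed likelihood, which is a convenient check that the two formulas agree.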


  • Priors: we choose conjugate ones
    • for the parameters: <math>\forall k \; \mu_k \sim Normal(\mu_0,\tau_0)</math> and <math>\forall k \; \tau_k \sim Gamma(\alpha,\beta)</math>
    • for the latent variables: <math>\forall n \; z_n \sim Multinomial_K(1,\mathbf{w})</math> and <math>\mathbf{w} \sim Dirichlet(\gamma)</math>
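
One way to get a feel for these priors is to draw from them, as in the sketch below. Two points are assumptions that the text does not settle: Gamma(α, β) is read as shape/rate (Python's `random.gammavariate` expects shape/scale, hence the `1.0 / beta`), and Dirichlet(γ) is taken to be symmetric with a single scalar concentration γ. The function name and the hyperparameter values are hypothetical.

```python
import math
import random

def draw_from_priors(n_obs, n_comp, mu0, tau0, alpha, beta, gamma, seed=None):
    """Draw (w, mu, tau, z) from the priors, under the assumptions stated above."""
    rng = random.Random(seed)
    # w ~ Dirichlet(gamma, ..., gamma): normalise independent Gamma(gamma, 1) draws.
    g = [rng.gammavariate(gamma, 1.0) for _ in range(n_comp)]
    w = [g_k / sum(g) for g_k in g]
    # mu_k ~ Normal(mu0, tau0), tau0 being a precision, so sd = 1 / sqrt(tau0).
    mu = [rng.gauss(mu0, 1.0 / math.sqrt(tau0)) for _ in range(n_comp)]
    # tau_k ~ Gamma(alpha, beta), assumed shape/rate; gammavariate takes shape/scale.
    tau = [rng.gammavariate(alpha, 1.0 / beta) for _ in range(n_comp)]
    # z_n ~ Multinomial_K(1, w): a one-hot vector of length K.
    z = []
    for _ in range(n_obs):
        k = rng.choices(list(range(n_comp)), weights=w)[0]
        z.append([1 if j == k else 0 for j in range(n_comp)])
    return w, mu, tau, z

# Hypothetical hyperparameter values, for illustration only.
w, mu, tau, z = draw_from_priors(5, 3, mu0=0.0, tau0=0.1, alpha=2.0, beta=1.0,
                                 gamma=1.0, seed=42)
```

The weights drawn this way satisfy both constraints by construction: they are positive and sum to 1.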


