User:Timothee Flutre/Notebook/Postdoc/2012/08/16: Difference between revisions

From OpenWetWare
(→‎Entry title: first version)
m (→‎Variational Bayes approach for the mixture of Normals: fix error prior \mu_k + add link precision)
Line 14: Line 14:




* '''Assumptions''': we assume the observations to be exchangeable and distributed according to a mixture of K Normal distributions. The parameters of this model are the mixture weights (<math>w_k</math>), the means (<math>\mu_k</math>) and the precisions (<math>\tau_k</math>) of each mixture component, all gathered into <math>\Theta = \{w_1,\ldots,w_K,\mu_1,\ldots,\mu_K,\tau_1,\ldots,\tau_K\}</math>. There are two constraints: <math>\sum_{k=1}^K w_k = 1</math> and <math>\forall k \; w_k > 0</math>.
* '''Assumptions''': we assume the observations to be exchangeable and distributed according to a mixture of K Normal distributions. The parameters of this model are the mixture weights (<math>w_k</math>), the means (<math>\mu_k</math>) and the [http://en.wikipedia.org/wiki/Precision_%28statistics%29 precisions] (<math>\tau_k</math>) of each mixture component, all gathered into <math>\Theta = \{w_1,\ldots,w_K,\mu_1,\ldots,\mu_K,\tau_1,\ldots,\tau_K\}</math>. There are two constraints: <math>\sum_{k=1}^K w_k = 1</math> and <math>\forall k \; w_k > 0</math>.




* '''Observed likelihood''': <math>p(\mathbf{y} | \Theta, K) = \prod_{n=1}^N p(y_n|\Theta,K) = \prod_{n=1}^N \sum_{k=1}^K w_k Normal(y_n;\mu_k,\tau_k)</math>
* '''Observed likelihood''': <math>p(\mathbf{y} | \Theta, K) = \prod_{n=1}^N p(y_n|\Theta,K) = \prod_{n=1}^N \sum_{k=1}^K w_k Normal(y_n;\mu_k,\tau_k^{-1})</math>




Line 23: Line 23:




* '''Augmented likelihood''': <math>p(\mathbf{y},\mathbf{z}|\Theta,K) = \prod_{n=1}^N p(y_n,z_n|\Theta,K) = \prod_{n=1}^N p(z_n|\Theta,K) p(y_n|z_n,\Theta,K) = \prod_{n=1}^N \prod_{k=1}^K w_k^{z_{nk}} Normal(y_n;\mu_k,\tau_k)^{z_{nk}}</math>
* '''Augmented likelihood''': <math>p(\mathbf{y},\mathbf{z}|\Theta,K) = \prod_{n=1}^N p(y_n,z_n|\Theta,K) = \prod_{n=1}^N p(z_n|\Theta,K) p(y_n|z_n,\Theta,K) = \prod_{n=1}^N \prod_{k=1}^K w_k^{z_{nk}} Normal(y_n;\mu_k,\tau_k^{-1})^{z_{nk}}</math>




* '''Priors''': we choose conjugate ones
* '''Priors''': we choose conjugate ones
** for the parameters: <math>\forall k \; \mu_k \sim Normal(\mu_0,\tau_0)</math> and <math>\forall k \; \tau_k \sim Gamma(\alpha,\beta)</math>
** for the parameters: <math>\forall k \; \mu_k | \tau_k \sim Normal(\mu_0,(\tau_0 \tau_k)^{-1})</math> and <math>\forall k \; \tau_k \sim Gamma(\alpha,\beta)</math>
** for the latent variables: <math>\forall n \; z_n \sim Multinomial_K(1,\mathbf{w})</math> and <math>\mathbf{w} \sim Dirichlet(\gamma)</math>
** for the latent variables: <math>\forall n \; z_n \sim Multinomial_K(1,\mathbf{w})</math> and <math>\mathbf{w} \sim Dirichlet(\gamma)</math>



Revision as of 11:29, 31 August 2012


Variational Bayes approach for the mixture of Normals

  • Motivation: on another page, I described the basics of mixture models and the EM algorithm in a frequentist context. That page is worth reading before continuing. Here I am interested in the Bayesian approach, and in particular in a specific variational method (nicknamed "Variational Bayes").


  • Data: we have N univariate observations, [math]\displaystyle{ y_1, \ldots, y_N }[/math], gathered into the vector [math]\displaystyle{ \mathbf{y} }[/math].


  • Assumptions: we assume the observations to be exchangeable and distributed according to a mixture of K Normal distributions. The parameters of this model are the mixture weights ([math]\displaystyle{ w_k }[/math]), the means ([math]\displaystyle{ \mu_k }[/math]) and the precisions ([math]\displaystyle{ \tau_k }[/math]) of each mixture component, all gathered into [math]\displaystyle{ \Theta = \{w_1,\ldots,w_K,\mu_1,\ldots,\mu_K,\tau_1,\ldots,\tau_K\} }[/math]. There are two constraints: [math]\displaystyle{ \sum_{k=1}^K w_k = 1 }[/math] and [math]\displaystyle{ \forall k \; w_k > 0 }[/math].


  • Observed likelihood: [math]\displaystyle{ p(\mathbf{y} | \Theta, K) = \prod_{n=1}^N p(y_n|\Theta,K) = \prod_{n=1}^N \sum_{k=1}^K w_k Normal(y_n;\mu_k,\tau_k^{-1}) }[/math]
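
To make the observed likelihood concrete, here is a minimal sketch that evaluates its logarithm for given parameter values. It assumes, as in the formula above, that each component is parametrized by its mean and its precision (so the variance is the inverse of tau_k). The function names and the toy values are mine, chosen only for illustration, and the log-sum-exp step is a standard numerical precaution rather than part of the model.

```python
import math

def normal_logpdf(y, mu, tau):
    """Log density of Normal(y; mean=mu, variance=1/tau), tau being the precision."""
    return 0.5 * (math.log(tau) - math.log(2.0 * math.pi)) - 0.5 * tau * (y - mu) ** 2

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of finite values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def observed_loglik(y, w, mu, tau):
    """log p(y | Theta, K) = sum_n log sum_k w_k Normal(y_n; mu_k, 1/tau_k)."""
    total = 0.0
    for y_n in y:
        terms = [math.log(w_k) + normal_logpdf(y_n, mu_k, tau_k)
                 for w_k, mu_k, tau_k in zip(w, mu, tau)]
        total += logsumexp(terms)
    return total

# Hypothetical toy values (K = 2), for illustration only.
y = [-1.2, 0.3, 4.8, 5.1]
w = [0.5, 0.5]
mu = [0.0, 5.0]
tau = [1.0, 4.0]
loglik = observed_loglik(y, w, mu, tau)
```

Working on the log scale avoids underflow when N is large, since the product over observations becomes a sum.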


  • Latent variables: let's introduce N latent variables, [math]\displaystyle{ z_1,\ldots,z_N }[/math], gathered into the vector [math]\displaystyle{ \mathbf{z} }[/math]. Each [math]\displaystyle{ z_n }[/math] is a vector of length K with a single 1 indicating the component to which the [math]\displaystyle{ n^{th} }[/math] observation belongs, and K-1 zeroes.


  • Augmented likelihood: [math]\displaystyle{ p(\mathbf{y},\mathbf{z}|\Theta,K) = \prod_{n=1}^N p(y_n,z_n|\Theta,K) = \prod_{n=1}^N p(z_n|\Theta,K) p(y_n|z_n,\Theta,K) = \prod_{n=1}^N \prod_{k=1}^K w_k^{z_{nk}} Normal(y_n;\mu_k,\tau_k^{-1})^{z_{nk}} }[/math]
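
The augmented likelihood can be sketched the same way. Because each z_n is a one-hot vector, the double product keeps exactly one factor per observation, so on the log scale we simply add the log weight and the log density of the assigned component. The function names and toy values below are hypothetical, and the precision parametrization is the one used in the formula above.

```python
import math

def normal_logpdf(y, mu, tau):
    """Log density of Normal(y; mean=mu, variance=1/tau), tau being the precision."""
    return 0.5 * (math.log(tau) - math.log(2.0 * math.pi)) - 0.5 * tau * (y - mu) ** 2

def augmented_loglik(y, z, w, mu, tau):
    """log p(y, z | Theta, K) = sum_n sum_k z_nk [log w_k + log Normal(y_n; mu_k, 1/tau_k)]."""
    total = 0.0
    for y_n, z_n in zip(y, z):
        for z_nk, w_k, mu_k, tau_k in zip(z_n, w, mu, tau):
            if z_nk == 1:  # only the assigned component contributes
                total += math.log(w_k) + normal_logpdf(y_n, mu_k, tau_k)
    return total

# Hypothetical toy values (K = 2), for illustration only.
y = [-1.2, 0.3, 4.8, 5.1]
z = [[1, 0], [1, 0], [0, 1], [0, 1]]  # one-hot component indicators
w = [0.5, 0.5]
mu = [0.0, 5.0]
tau = [1.0, 4.0]
aug_loglik = augmented_loglik(y, z, w, mu, tau)
```

Summing the augmented likelihood over all possible values of z gives back the observed likelihood, which is why the latent variables can be introduced without changing the model.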


  • Priors: we choose conjugate ones
    • for the parameters: [math]\displaystyle{ \forall k \; \mu_k | \tau_k \sim Normal(\mu_0,(\tau_0 \tau_k)^{-1}) }[/math] and [math]\displaystyle{ \forall k \; \tau_k \sim Gamma(\alpha,\beta) }[/math]
    • for the latent variables: [math]\displaystyle{ \forall n \; z_n \sim Multinomial_K(1,\mathbf{w}) }[/math] and [math]\displaystyle{ \mathbf{w} \sim Dirichlet(\gamma) }[/math]
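
One way to check the whole specification is to simulate from it: draw the parameters from their priors, then the latent variables, then the observations. The sketch below does this with the Python standard library only. It rests on two assumptions that the text above does not state: Gamma(alpha, beta) is read as shape alpha and rate beta, and Dirichlet(gamma) is read as a symmetric Dirichlet with a scalar concentration gamma. The function name and the hyperparameter values are hypothetical.

```python
import random

def sample_mixture(N, K, mu_0, tau_0, alpha, beta, gamma, seed=0):
    """Draw (w, mu, tau, z, y) from the priors and the augmented likelihood."""
    rng = random.Random(seed)
    # w ~ Dirichlet(gamma), assumed symmetric: normalize K Gamma(gamma, 1) draws.
    g = [rng.gammavariate(gamma, 1.0) for _ in range(K)]
    g_sum = sum(g)
    w = [g_k / g_sum for g_k in g]
    # tau_k ~ Gamma(alpha, beta), assumed shape/rate; gammavariate expects a scale.
    tau = [rng.gammavariate(alpha, 1.0 / beta) for _ in range(K)]
    # mu_k | tau_k ~ Normal(mu_0, (tau_0 * tau_k)^-1); gauss expects a std deviation.
    mu = [rng.gauss(mu_0, (tau_0 * tau_k) ** -0.5) for tau_k in tau]
    z, y = [], []
    for _ in range(N):
        # z_n ~ Multinomial_K(1, w), stored as a one-hot vector.
        k = rng.choices(range(K), weights=w)[0]
        z.append([1 if j == k else 0 for j in range(K)])
        # y_n | z_n ~ Normal(mu_k, 1 / tau_k).
        y.append(rng.gauss(mu[k], tau[k] ** -0.5))
    return w, mu, tau, z, y

# Hypothetical hyperparameters, for illustration only.
w, mu, tau, z, y = sample_mixture(N=100, K=3, mu_0=0.0, tau_0=0.1,
                                  alpha=2.0, beta=1.0, gamma=1.0, seed=42)
```

A small tau_0 spreads the component means widely around mu_0 relative to the within-component spread, which tends to produce well-separated components in the simulated data.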