
I am working on programming a GMM fitted with EM. I am stuck on the following problem. As you will see on this website, there is a parameter "pi", which is in other words the weight or probability value.

My question is: how is this calculated? Or, in actual code, is it ignored?

Alex Riley
kcc__

2 Answers


pi_k is the mixture coefficient for the k-th Gaussian. You definitely can't ignore it. The maximum likelihood estimator for pi_k is the mean of the k-th indicator variables for your instances. The page you referenced calls these indicators \alpha_ik.
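A minimal sketch of that estimator, assuming a hypothetical responsibility matrix `alpha` with one row per instance and one column per component (the \alpha_ik indicators from the referenced page):

```python
import numpy as np

# Hypothetical responsibilities alpha[i, k]: the E-step's estimate of how
# strongly instance i belongs to component k (each row sums to 1).
alpha = np.array([
    [0.9, 0.1],
    [0.8, 0.2],
    [0.3, 0.7],
    [0.1, 0.9],
])

# ML estimate of the mixture coefficients pi_k: the mean of each
# component's indicator column. pi sums to 1 by construction.
pi = alpha.mean(axis=0)
print(pi)  # [0.525 0.475]
```

Equivalently, pi_k = N_k / N, where N_k is the effective number of instances assigned to component k.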

Ben Allison

As you may have read, each iteration of EM has two steps: an expectation step and a maximization step. At each expectation step, we get an increasingly refined idea of how much each training sample belongs to each cluster. Using this estimate, in the maximization step we calculate the GMM parameters that maximize the likelihood. pi_k is one of the parameters calculated in the maximization step, so pi_k is re-estimated at every iteration.
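Those two steps can be sketched for a one-dimensional two-component GMM as follows (the data, initial values, and iteration count are illustrative assumptions, not part of the original answer). Note how `pi` is recomputed from the responsibilities on every pass:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy 1-D data drawn from two well-separated clumps (assumed for illustration).
x = np.concatenate([rng.normal(-4, 1, 200), rng.normal(4, 1, 200)])

# Initial guesses for the two-component GMM parameters.
pi = np.array([0.5, 0.5])   # mixture coefficients pi_k
mu = np.array([-1.0, 1.0])  # component means
var = np.array([1.0, 1.0])  # component variances

for _ in range(50):
    # E-step: responsibilities alpha[i, k] proportional to pi_k * N(x_i | mu_k, var_k)
    dens = np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    alpha = pi * dens
    alpha /= alpha.sum(axis=1, keepdims=True)

    # M-step: re-estimate all parameters, including pi_k, from the responsibilities.
    nk = alpha.sum(axis=0)                                  # effective counts N_k
    pi = nk / len(x)                                        # pi_k = N_k / N
    mu = (alpha * x[:, None]).sum(axis=0) / nk
    var = (alpha * (x[:, None] - mu) ** 2).sum(axis=0) / nk

print(pi)  # close to [0.5, 0.5] for this balanced toy data
```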

Using the OpenCV implementation of EM, if em_model is your EM model and it has been trained, then

Mat weights = em_model.get<Mat>("weights");

will give you the values of pi_k.

koshy george