
Gaussian Mixture Model
Introduction
The EM algorithm is short for ExpectationMaximization
algorithm. It is based on an iterative optimization of the centers
and widths of the kernels. The aim is to optimize the likelihood
that the given data points are generated by a mixture of Gaussians.
The numbers next to the Gaussians give the relative importance
(amplitude) of each component.
Credits
This applet was developed by
S.
Akaho and adapted by
Olivier
Michel.
Instructions
 You can select the Gaussian Mixture fitting by choosing
"GaussMix", or Multiple Line fitting by choosing
"LineMix".
 Click in the image plane to add sample points.
 You can add 10 uniformly distributed sample points by clicking
on "RandomPts".
 You can also add 100 uniformly distributed sample points lying
on a ring by clicking on "RingPts".
 "ClearPts" allows to remove all sample points
 You can choose the number of kernels by selecting 1 to
5 with the second rightmost button.
 You can initialize the kernels by clicking on
"InitKernels".
 You can start the EM algorithm by selecting "EM Run" or
"EM 1 Step" (one step)
 You can stop "EM Run" by selecting "EM
Stop".
Applet
Questions
 Define about 30 data points located in 3 separate clusters. One
of the cluster may be a little bit larger than the others. Set the
number of kernels to 3. Switch the rightmost button to 'EM 1
Step'. Click repeatedly on this button and watch how the kernels
converge to the clusters.
 Keep the same data points and reinitialize the system. You can
switch to 'EM Run', watch how it converges, and the reinitialize
again. Repeat this several times in order to find out how reliably
the EM algorithm finds the 3 clusters.
 Use the same data points, but allow for 5 kernels. Does it
still converge to a reasonable solution? Repeat the experiment
several times.
 Add random data points by pressing the random points button
once. Restart the algorithm. Does it still find the original
clusters? (Test it several times). Add more random points by
clicking again on the random points button and repeat the
experiment.
 Keep all data points and reduce the number of kernels back to
3. Does the algorithm still find the clusters?
 Clear the data points and restart with two clusters (each 10
points) and two kernels. Add more and more points to one of the
clusters and watch how the importance of the respective Gaussian
component increases.
Bugs
If Gaussians or Lines disappear by EM, initialize kernels by
click "InitKernels".
EM algorithm sometimes stops by Applet exception error in Netscape
Navigator for Windows95 and Solaris2.4 (netscape for SGI is rather
stable).
