Principal Component Analysis
Introduction
A common method from statistics for analysing data is
principal component analysis (PCA). In communication theory,
it is known as the Karhunen-Loève transform. The aim is to
find a set of M orthogonal vectors in data space that
account for as much as possible of the data's variance. Projecting
the data from their original N-dimensional space onto the
M-dimensional subspace spanned by these vectors then
performs a dimensionality reduction that often retains most
of the intrinsic information in the data.
The first principal component is taken to be along the direction
with the maximum variance. The second principal component is
constrained to lie in the subspace perpendicular to the first.
Within that subspace, it points in the direction of the
maximum variance. Then, the third principal component (if any) is
taken in the maximum variance direction in the subspace
perpendicular to the first two, and so on.
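For the two-dimensional case used by the applet, the construction above can be sketched directly from the covariance of the data. The following is a minimal illustration, not the applet's code: the function names `pca_2d` and `project` are hypothetical, and the closed-form angle formula applies only to two-dimensional data.

```python
import math

def pca_2d(points):
    """Return (mean, (first, second), (var1, var2)) for a list of (x, y) points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Entries of the 2x2 covariance matrix of the centred data.
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Angle of the maximum-variance direction (closed form for a
    # symmetric 2x2 matrix): tan(2 * theta) = 2 * sxy / (sxx - syy).
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    first = (math.cos(theta), math.sin(theta))
    # In two dimensions the second component is simply the perpendicular.
    second = (-math.sin(theta), math.cos(theta))

    def projected_variance(u):
        # Variance of the data after projection onto the unit vector u.
        return u[0] * u[0] * sxx + 2.0 * u[0] * u[1] * sxy + u[1] * u[1] * syy

    return (mx, my), (first, second), (projected_variance(first), projected_variance(second))

def project(points, mean, direction):
    """Reduce 2-D points to 1-D coordinates along a unit direction."""
    return [(x - mean[0]) * direction[0] + (y - mean[1]) * direction[1]
            for x, y in points]

# Illustrative data (made up): an elongated cluster.
cluster = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.2), (3.0, 2.8), (4.0, 4.1)]
mean, (first, second), (var1, var2) = pca_2d(cluster)
reduced = project(cluster, mean, first)  # M = 1 coordinate per point
```

Keeping only the coordinates in `reduced` corresponds to the dimensionality reduction from N = 2 to M = 1; the share of variance retained is `var1 / (var1 + var2)`.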
Credits
The original applet was written by Olivier Michel.
Implementation
Principal component analysis is implemented as a neural
algorithm called APEX (Adaptive Principal component EXtraction)
developed by Kung and Diamantaras (1990).
Instructions
This applet allows the user to place a number of points in a
two-dimensional space by clicking with the mouse. The user may then
specify the number of iterations for the neural PCA algorithm.
When the PCA button is pressed, the following steps are carried out:
- The origin O is computed as the center of gravity (the mean) of
the set of points.
- The first eigenvector is computed by the neural PCA algorithm.
The components of this vector correspond to the weights between the
input layer and the first output unit.
- The first eigenvector is displayed in the two-dimensional
space.
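The steps above can be sketched as follows. This is a hedged illustration, not the applet's source: it trains a single linear output unit with Oja's Hebbian learning rule, which is used here as a stand-in for the first output unit of the neural algorithm. The function name `first_component_oja`, the learning rate, the default iteration count, and the sample points are all assumptions. The learning rate must be small relative to the scale of the data, otherwise the weights diverge.

```python
import math
import random

def first_component_oja(points, iterations=200, rate=0.01, seed=0):
    """Estimate the origin O and the first eigenvector of 2-D points.

    Uses Oja's rule on one linear unit (an assumed stand-in for the
    applet's first output unit). Returns (origin, unit_vector).
    """
    rng = random.Random(seed)
    n = len(points)
    # Step 1: the origin O is the center of gravity of the points.
    ox = sum(x for x, _ in points) / n
    oy = sum(y for _, y in points) / n
    centred = [(x - ox, y - oy) for x, y in points]
    # Step 2: weights between the two inputs and the output unit,
    # started at small random values.
    w = [rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5)]
    for _ in range(iterations):
        for x, y in centred:
            out = w[0] * x + w[1] * y  # output of the linear unit
            # Oja's rule: Hebbian growth (out * input) minus a decay
            # term (out^2 * w) that keeps the weight norm near 1.
            w[0] += rate * out * (x - out * w[0])
            w[1] += rate * out * (y - out * w[1])
    # Step 3: normalise so the vector can be drawn as a direction.
    norm = math.hypot(w[0], w[1])
    return (ox, oy), (w[0] / norm, w[1] / norm)

# Illustrative data (made up): a cluster stretched roughly along y = 2x.
cluster = [(-2.0, -4.2), (-1.0, -1.8), (0.0, 0.3), (1.0, 2.1), (2.0, 3.6)]
origin, direction = first_component_oja(cluster)
```

The sign of the returned vector depends on the random starting weights; `direction` and its negation describe the same axis.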
Applet
Questions
- Define a cluster of data points. The cluster should not be
perfectly circular but should have a preferred direction. Click on PCA
and check whether the algorithm finds the preferred direction.
- Now add more data points so that the cluster is banana-shaped.
Click on PCA. Is PCA useful in this case?