
Figure 1: Data set X projected to two dimensions.

As an example, our task is to create and train a perceptron that correctly classifies points belonging to three different classes. First we load the data from the file winedata.mat:
>> load winedata X C
Each row of X represents a sample point whose class is given by the corresponding element (row) of C. Next, the data is transformed into the input/output format used by the Neural Network Toolbox:
>> P=X';
where P(:,i) is the ith point. Since we want to distinguish three different classes, we use three perceptrons, one for the classification of each class. The corresponding target matrix is generated by
>> T=ind2vec(C);
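The function ind2vec converts the vector of class indices into a sparse matrix of one-of-N target vectors, one column per sample. A small example of what it produces:

```matlab
% Each class index becomes a column with a single 1 in the row
% corresponding to that class (stored as a sparse matrix):
T = ind2vec([1 3 2]);
full(T)
% ans =
%      1     0     0
%      0     0     1
%      0     1     0
```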
To create the perceptron layer with the correct input range, type
>> net=newp(minmax(P),size(T,1));
Both functions, train and adapt, are used for training a neural network, and most of the time either can be used for the same network. The most important difference is incremental training (updating the weights after the presentation of each single training sample) versus batch training (updating the weights after presenting the complete data set).
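Schematically, the two regimes differ as follows. This is pseudocode to illustrate the idea, not Toolbox API; sim_one, sim_all and update are hypothetical helpers:

```matlab
% Incremental training: one weight update per sample
for i = 1:numSamples
    e = T(:,i) - sim_one(net, P(:,i));  % error for sample i (hypothetical helper)
    net = update(net, e, P(:,i));       % weights change immediately
end

% Batch training: one weight update per pass through the data
E = T - sim_all(net, P);                % errors for all samples at once
net = update(net, E, P);                % single accumulated weight change
```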
First, set net.adaptFcn to the desired adaptation function. We'll use trains, which performs sequential (incremental) updates and allows a separate learning function for each weight and bias. Check the Matlab documentation for a complete overview of the available adaptation functions.
>> net.adaptFcn = 'trains';
Next, since we're using trains, we'll have to set the learning function for all
weights and biases:
>> net.inputWeights{1,1}.learnFcn = 'learnp';
>> net.biases{1}.learnFcn = 'learnp';
where learnp is the perceptron learning rule. Finally, a useful parameter is net.adaptParam.passes, the maximum number of times the complete training set may be used for updating the network:
>> net.adaptParam.passes = 1;
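The perceptron rule implemented by learnp adjusts each weight by the product of the error and the input. A sketch of the update for a single input vector p with target t, assuming the weight matrix W and bias b are already defined:

```matlab
% Perceptron learning rule (sketch): for output a = hardlim(W*p + b),
% the error e = t - a drives the update
a = hardlim(W*p + b);   % current output (0 or 1 per neuron)
e = t - a;              % per-neuron error in {-1, 0, 1}
W = W + e*p';           % dW = e*p'
b = b + e;              % db = e
```

When e is zero the sample is already classified correctly and the weights are left unchanged.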
When using adapt, both incremental and batch
training can be used. Which one is actually used depends on the
format of your training set. If it consists of two matrices of
input and target vectors, like
>> [net,y,e] = adapt(net,P,T);
the network will be updated using batch training. Note that all elements of the matrix y are one (the initial weights and biases are zero, and hardlim(0) = 1), because the weights are not updated until the complete training set has been presented.
If the training set is given in the form of a cell array
>> for i = 1:length(P), P2{i} = P(:,i); T2{i}= T(:,i); end
>> net = init(net);
>> [net,y2,e2] = adapt(net,P2,T2);
then incremental training will be used. Notice that the weights had to be re-initialized before the network adaptation was started. Since adapt takes a lot more time than train, we continue our analysis with the latter.
When using train, on the other hand, only batch training will be used, regardless of the format of the data (you can use both). The advantage of train is that it provides a much larger choice of training functions (gradient descent, gradient descent with momentum, Levenberg-Marquardt, etc.), which are implemented very efficiently. So for static networks (no tapped delay lines), train is usually the better choice.
We set
>> net.trainFcn = 'trainb';
for batch learning and
>> net.trainFcn = 'trainc';
for online learning. Which training parameters are present depends on your choice of training function. In our case two useful parameters are net.trainParam.epochs, the maximum number of times the complete data set may be used for training, and net.trainParam.show, the number of epochs between status reports of the training function. For example,
>> net.trainParam.epochs = 1000;
>> net.trainParam.show = 100;
We initialize and train the network with
>> net = init(net);
>> [net,tr] = train(net,P,T);
The training error is calculated with
>> Y=sim(net,P);
>> train_error=mae(Y-T)
train_error =
0.3801
So we see that the three classes of the data set are not linearly separable. The best epoch at which to stop learning can be found with
>> [min_perf,min_epoch]=min(tr.perf)
min_perf =
0.1948
min_epoch =
703
Figure 2: Performance of the learning algorithm train over 1000 epochs.

