Analyse two heuristics (early stopping and weight decay) for
avoiding overfitting when training multilayer neural networks.
Use the Boston Housing dataset housing.mat
contained in the archive housing.zip. See also
housing-description.txt for more information on the dataset.
Initialize the random number generator using the Matlab
commands rand('state',<MatrNmr>); and
randn('state',<MatrNmr>);.
Split the dataset randomly (a useful command is
randperm) into a training set (50%), a validation set
(25%), and a test set (25%). Normalize the
data with prestd.
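The split and normalization steps above can be sketched as follows. This is a non-authoritative sketch for the legacy Neural Network Toolbox; the seed value, all variable names, and the choice to compute the normalization statistics on the training set (applying them to the other sets with trastd) are assumptions, not part of the exercise text:

```matlab
% Sketch: random 50/25/25 split and normalization (legacy NN Toolbox).
rand('state', 1234567);            % illustrative; use your matriculation number
randn('state', 1234567);

load housing.mat                   % assumed to provide inputs p (R x N) and targets t (1 x N)
N   = size(p, 2);
idx = randperm(N);                 % random permutation of the sample indices
nTr = round(0.50 * N);             % 50% training
nVa = round(0.25 * N);             % 25% validation; the remainder is the test set

iTr = idx(1:nTr);
iVa = idx(nTr+1 : nTr+nVa);
iTe = idx(nTr+nVa+1 : end);

% prestd: normalize to zero mean and unit standard deviation;
% trastd applies the training-set statistics to the other sets.
[ptr, meanp, stdp, ttr, meant, stdt] = prestd(p(:,iTr), t(:,iTr));
pva = trastd(p(:,iVa), meanp, stdp);  tva = trastd(t(:,iVa), meant, stdt);
pte = trastd(p(:,iTe), meanp, stdp);  tte = trastd(t(:,iTe), meant, stdt);
```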
Train a two-layer network with the Quasi-Newton method
trainbfg and hidden units on the training set:
(a) without any heuristic to avoid overfitting;
(b) with early stopping (hand over the validation set
to the function train);
(c) with weight decay (use net.performFcn =
'msereg' and net.performParam.ratio = 0.5).
Repeat these three points with
. Use the
default parameters and train for at most 500 epochs.
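The three training variants (a)-(c) can be sketched as below, again as a hedged sketch for the legacy Neural Network Toolbox. The number of hidden units nH is an illustrative assumption (the exercise specifies the values to use), as are the variable names, which follow the split/normalization notation above:

```matlab
% Sketch of training variants (a)-(c); nH is an illustrative assumption.
nH  = 5;
net = newff(minmax(ptr), [nH 1], {'tansig','purelin'}, 'trainbfg');
net.trainParam.epochs = 500;       % train for at most 500 epochs

% (a) plain training, no overfitting heuristic
netA = train(net, ptr, ttr);

% (b) early stopping: hand the validation set to train via the VV struct
vv.P = pva;  vv.T = tva;
netB = train(net, ptr, ttr, [], [], vv);

% (c) weight decay via the regularized performance function
netC = net;
netC.performFcn = 'msereg';
netC.performParam.ratio = 0.5;
netC = train(netC, ptr, ttr);
```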
Create a plot which shows, for (a)-(c), the MSE of the trained
networks on the test set
in dependence on .
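A possible shape for this plot is sketched below; the range of hidden-unit counts nHs is an illustrative assumption (the exercise specifies the actual values), and the training inside the loop is elided, standing in for the variants (a)-(c) above:

```matlab
% Sketch: test-set MSE of each variant versus the number of hidden units.
nHs  = 1:10;                       % assumed range; use the values from the exercise
mseA = zeros(size(nHs)); mseB = mseA; mseC = mseA;
for k = 1:length(nHs)
    % ... train netA, netB, netC with nHs(k) hidden units as above ...
    yA = sim(netA, pte);  mseA(k) = mean((yA - tte).^2);
    yB = sim(netB, pte);  mseB(k) = mean((yB - tte).^2);
    yC = sim(netC, pte);  mseC(k) = mean((yC - tte).^2);
end
plot(nHs, mseA, 'o-', nHs, mseB, 's-', nHs, mseC, 'd-');
xlabel('number of hidden units'); ylabel('test-set MSE');
legend('no heuristic', 'early stopping', 'weight decay');
```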
Interpret the plot. How big is the benefit of each method?
Which method seems most favorable? What are the advantages
and disadvantages of each method? Could the dataset be used better
for the weight decay heuristic?
Hand in your matriculation number and the first 10 elements of each
of the sets .
Present your results clearly, in a structured and legible form. Document
them in such a way that anybody can easily reproduce them.
Please hand in a printout of the Matlab program you have written.