
[Points: 12.5; Issued: 2008/03/20; Deadline: 2008/05/16; Tutor:
Roland Unterberger; Infohour: 2008/05/09, 15:30-16:30, HS i11;
Review of graded work: 2008/05/30, 15:30-16:30, HS i11; Download: pdf; ps.gz]
Similar to homework 2.1, a simple 1-dimensional function is to be
learned with feedforward neural networks. Use the same data set as for homework 1.
 Train a neural network with neurons. Use the training algorithm
'trainscg' and train for epochs. Use the regularized error function
msereg with different regularization factors
(net.performParam.ratio in Matlab) of
 Plot the mean squared error of the training and of the test set
for the given regularization factors.
 Interpret your results. What is the best value of ? Also compare your results
with those of homework 2.1. Is the appropriate selection of the number of
hidden neurons, early stopping, or the regularized error function
the best choice to avoid overfitting in this example? Explain your
choice! [1 extra point]
 Normalize your input data using mapstd.
 This time you cannot use the performance structure returned by
the train function, because it contains the regularized
error function (msereg) and not the mse. Use the mse function
instead.
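The two notes above can be combined into a short sketch. This is only an illustration, not the required solution; the variable names (p_train, t_train, p_test, t_test, nhidden, gamma) are placeholders, and the newff/train calls assume the Matlab 7-era Neural Network Toolbox API the assignment uses.

```matlab
% Sketch only -- placeholder variable names, not prescribed by the assignment.
[pn_train, ps] = mapstd(p_train);        % normalize inputs: zero mean, unit std
pn_test = mapstd('apply', p_test, ps);   % reuse the same normalization on the test set

net = newff(minmax(pn_train), [nhidden 1], {'tansig' 'purelin'}, 'trainscg');
net.performFcn = 'msereg';               % regularized error function
net.performParam.ratio = gamma;          % one of the given regularization factors
net = train(net, pn_train, t_train);

% The training record reports msereg, so recompute the plain mse by hand:
mse_train = mse(t_train - sim(net, pn_train));
mse_test  = mse(t_test  - sim(net, pn_test));
```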
Overfitting with Real-World Data [8.5 points]
In this homework you are supposed to analyse different overfitting
avoidance mechanisms with neural networks. Use the housing.mat dataset, which contains data about the
prices of houses in Boston. The task is to predict the price of
unseen houses (regression task). For a detailed description of the
dataset see housing_description.txt. You can use the script
housing_template.m as a template.
 Split the dataset randomly (a useful command is
randperm) into a training set (75%) and a test set (25%).
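One way to do this split with randperm is sketched below; the template contains the official version, and the variable names p (features x samples) and t (1 x samples) are assumptions, not given by the assignment.

```matlab
% Sketch of a random 75/25 split -- placeholder variable names.
n   = size(p, 2);                  % number of samples (one per column)
idx = randperm(n);                 % random permutation of sample indices
ntr = round(0.75 * n);             % 75% of the samples for training
p_train = p(:, idx(1:ntr));      t_train = t(idx(1:ntr));
p_test  = p(:, idx(ntr+1:end));  t_test  = t(idx(ntr+1:end));
```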
 Train a neural network with neurons. Use the training
algorithm trainscg and learn for 500 epochs.
 Plot the mean squared error of the training and of the test set
for the given numbers of neurons. For the test set, plot both the
mean squared error (mse) after training and the minimum mse during
training. For which number of neurons can we observe underfitting
and for which overfitting (with standard training, i.e. using the
mse after the full 500 epochs)?
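One way to obtain both test-set curves is to let train monitor the test set and read the per-epoch record back. This sketch assumes the Matlab 7-era train(net, P, T, Pi, Ai, VV, TV) signature, under which the returned record tr holds the monitored performance per epoch in tr.tstperf; variable names are placeholders.

```matlab
% Sketch: per-epoch test error via the training record (placeholder names).
TV.P = pn_test;  TV.T = t_test;    % test set, monitored (not used) by train
net = newff(minmax(pn_train), [nhidden 1], {'tansig' 'purelin'}, 'trainscg');
net.trainParam.epochs = 500;
[net, tr] = train(net, pn_train, t_train, [], [], [], TV);

mse_after = tr.tstperf(end);       % test mse after the full 500 epochs
mse_min   = min(tr.tstperf);       % minimum test mse seen during training
```

With the default performance function (mse) both recorded values are directly the quantities to plot; with msereg the record would hold the regularized error instead, as noted in the first exercise.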
 What is the best number of hidden neurons for standard training
(mse after training) and what for early stopping (minimum mse
during training)? Is there a difference between those two numbers?
If yes, why?
 Train a neural network with 60 hidden neurons and the
regularized error function msereg. Use the training
algorithm trainscg and learn for 500 epochs. Use the
following values for the regularization factor
(net.performParam.ratio):
 Plot the mean squared error of the training and of the test set
for the given regularization factors. What is the best
regularization factor? Values that work well for this example
differ significantly from those in example 3.1. Why? For which
values can we observe underfitting and for which overfitting?
 Compare the results to standard training and to early stopping
with a varying number of hidden units. Which overfitting avoidance
method would you prefer? Explain your choice!
 Repeat the whole experiment using a 4-fold cross-validation
instead of the test set to estimate the true error. Create the same
plots and compare them. Which results are more reliable? Does your
choice of the optimal number of hidden neurons and of the optimal
regularization factor change?
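The cross-validation bookkeeping can be sketched as below; this shows only the index handling under the same placeholder variable names as before, with the training step left as a comment.

```matlab
% Sketch of a 4-fold cross-validation loop (placeholder variable names).
k     = 4;
n     = size(p, 2);
idx   = randperm(n);                          % shuffle once, then slice into folds
edges = round(linspace(0, n, k + 1));         % fold boundaries
cv_mse = zeros(1, k);
for fold = 1:k
    te_i = idx(edges(fold)+1 : edges(fold+1));  % held-out fold
    tr_i = setdiff(idx, te_i);                  % remaining three folds
    % ... normalize with mapstd on p(:,tr_i), train the network as before ...
    % cv_mse(fold) = mse(t(te_i) - sim(net, p(:,te_i)));
end
% est_mse = mean(cv_mse);   % cross-validated estimate of the true error
```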
 Normalize the data using mapstd.
 You can find the source code for splitting the data into a
training and a test set in the template.
Present your results clearly, structured, and legibly. Document them
in such a way that anybody can easily reproduce them. Please hand
in a printout of the Matlab program you have used (no emails!).

