Computational Intelligence, SS08
2 VO 442.070 + 1 RU 708.070

Homework 13: Backprop and Overfitting



[Points: 8; Issued: 2004/03/04; Deadline: 2004/04/28; Tutor: Igor Vikic; Info hour: 2004/04/26, 12:00-13:00, Seminarraum IGI; Review of graded work (Einsichtnahme): 2004/05/17, 12:00-13:00, Seminarraum IGI; Download: pdf; ps.gz]





Analyse two heuristics, early stopping and weight decay, for avoiding overfitting when training multilayer neural networks with backpropagation.
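
As a point of reference, the two heuristics can be written down roughly as follows. This is a sketch under assumptions, not part of the assignment text: $E_D$ and $E_V$ denote the mean squared error on the training and validation set, $w_1,\dots,w_n$ are the weights and biases of the network, $w^{(e)}$ are the weights after epoch $e$, and $\gamma$ is the performance ratio (0.5 in this homework). All of these symbols are introduced here for illustration. The weight decay objective is assumed to match the toolbox performance function msereg and should be checked against the documentation of the toolbox version in use. The early stopping rule is an idealization, since the toolbox stops after a number of consecutive validation failures.

```latex
\begin{align*}
E_{\mathrm{reg}}(w) &= \gamma\, E_D(w) + (1-\gamma)\,\frac{1}{n}\sum_{j=1}^{n} w_j^2
  && \text{(weight decay)} \\
e^{\ast} &= \arg\min_{e}\; E_V\bigl(w^{(e)}\bigr)
  && \text{(early stopping: keep } w^{(e^{\ast})}\text{)}
\end{align*}
```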

  1. Use the Boston Housing dataset housing.mat contained in the archive housing.zip. See also housing-description.txt for more information on the dataset.
  2. Initialize the random number generator using the Matlab commands rand('state',<MatrNmr>); and randn('state',<MatrNmr>);.
  3. Split the dataset randomly (a useful command is randperm) into a training set $D$ (50%), a validation set $V$ (25%), and a test set $T$ (25%). Normalize the data with prestd.
  4. Train a two-layer network with $n_H$ hidden units on the training set $D$, using the quasi-Newton method trainbfg,
    (a) without heuristics to avoid overfitting.
    (b) with early stopping (hand over the validation set $V$ to the function train).
    (c) with weight decay (use net.performFcn = 'msereg' and net.performParam.ratio = 0.5).

    Repeat these three variants with $n_H = 1,2,4,8$. Use the default parameters and train for at most 500 epochs.

  5. Create a plot that shows, for (a) - (c), the MSE of the trained networks on the test set $T$ as a function of $n_H$.
  6. Interpret the plot. How large is the benefit of each method? Which method seems most favorable? What are the advantages and disadvantages of each method? Could the dataset be put to better use when applying the weight decay heuristic?
  7. Hand in your matriculation number (Matrikelnummer) and the first 10 elements of each of the sets $D$, $T$ and $V$.
  • Present your results in a clear, structured, and legible form. Document them so that anyone can easily reproduce them.
  • Please hand in a printout of the Matlab program you used.
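
The steps above could be organized along the following lines. This is only a sketch under several assumptions, not a reference solution: it assumes the Neural Network Toolbox interface of that period (newff, minmax, train with a validation structure, sim), that housing.mat provides an input matrix and a target row vector with one column per sample (the names p and t are hypothetical, check with whos after loading), and that the hidden layer uses tansig and the output layer purelin. The value of matrNr is a placeholder for your own matriculation number.

```matlab
% Sketch only: variable names p, t and the layer transfer functions are assumptions.
matrNr = 1234567;                 % placeholder, replace with your own number
rand('state', matrNr);
randn('state', matrNr);

load housing.mat                  % assumed to define p (inputs) and t (targets)
N = size(p, 2);                   % one column per sample

% Random split: 50% training (D), 25% validation (V), 25% test (T)
idx = randperm(N);
nD  = round(0.50 * N);
nV  = round(0.25 * N);
iD  = idx(1:nD);
iV  = idx(nD+1:nD+nV);
iT  = idx(nD+nV+1:end);

% Normalize inputs and targets to zero mean and unit standard deviation
[pn, meanp, stdp, tn, meant, stdt] = prestd(p, t);
pD = pn(:, iD);  tD = tn(:, iD);
pT = pn(:, iT);  tT = tn(:, iT);
VV.P = pn(:, iV);                 % validation structure handed to train
VV.T = tn(:, iV);

nHs  = [1 2 4 8];
mseT = zeros(3, numel(nHs));      % rows: (a) plain, (b) early stopping, (c) weight decay

for k = 1:numel(nHs)
    for m = 1:3
        net = newff(minmax(pD), [nHs(k) 1], {'tansig', 'purelin'}, 'trainbfg');
        net.trainParam.epochs = 500;          % train for at most 500 epochs
        if m == 3                             % (c) weight decay
            net.performFcn = 'msereg';
            net.performParam.ratio = 0.5;
        end
        if m == 2                             % (b) early stopping on V
            net = train(net, pD, tD, [], [], VV);
        else
            net = train(net, pD, tD);
        end
        err = sim(net, pT) - tT;              % error on normalized targets
        mseT(m, k) = mean(err .^ 2);
    end
end

% Test MSE as a function of the number of hidden units
plot(nHs, mseT(1, :), 'o-', nHs, mseT(2, :), 's-', nHs, mseT(3, :), 'd-');
xlabel('n_H');
ylabel('MSE on test set T (normalized targets)');
legend('(a) no heuristic', '(b) early stopping', '(c) weight decay');

% First 10 indices of each set, as requested in item 7
disp(iD(1:10)); disp(iT(1:10)); disp(iV(1:10));
```

Note that the test error here is computed on the normalized targets; whether to report it in the original units (via poststd) is a choice left to the write-up.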