Computational Intelligence, SS08
2 VO 442.070 + 1 RU 708.070

Homework 36: Digit classification with backprop



[Points: 12.5; Issued: 2006/03/22; Deadline: 2006/05/10; Tutor: Arian Mavriqi; Infohour: 2006/05/08, 13:00-14:00, HSi13; Einsichtnahme: 2006/05/22, 13:00-14:00, HSi13; Download: pdf; ps.gz]





Similar to the Optical Character Recognition tutorial, you are asked to train a feedforward network to perform a digit classification task. The difference from the tutorial is the data set: here you use the Digits data set. The file digits.mat (which you get when unzipping digits.zip) contains about 4000 images (8 pixel $ \times$ 8 pixel $ \times$ 16 colors) of training samples (learn.P) of hand-drawn digits (0,1,...,9) together with their classifications (learn.T), and about 2000 images of test samples (test.P, test.T).

  • The goal is to find a suitable network architecture (i.e. number of layers and number of hidden units) and training parameters (i.e. learning rate and momentum term) for the training function traingdx1 such that an optimal test performance is achieved.
  • Report the performance (i.e. percentage of test examples correctly classified) of at least 3 network architectures you have tried. One of them must be a network with no hidden units. No network should consist of more than 20 hidden units.
  • Train the network without heuristics to avoid overfitting (don't use early stopping or weight decay).
  • Discuss the relationship between the network architecture and the performance on the test set.
  • What can you conclude about the complexity of the task from the performance of the network without any hidden units?
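
A minimal sketch of setting up and training such a network, assuming the pre-2010 Neural Network Toolbox interface (newff, train) used in the tutorial; the hidden-layer size and the training parameters below are placeholder assumptions to be tuned, not a solution:

```matlab
load digits.mat                  % provides learn.P, learn.T, test.P, test.T

% One hidden layer with 15 units (within the 20-unit limit) and 10 outputs,
% trained with traingdx (gradient descent with momentum and adaptive lr).
net = newff(minmax(learn.P), [15 10], {'tansig','logsig'}, 'traingdx');

net.trainParam.lr     = 0.01;    % learning rate (assumption, to be tuned)
net.trainParam.mc     = 0.9;     % momentum term (assumption, to be tuned)
net.trainParam.epochs = 500;     % no early stopping, as required

net = train(net, learn.P, learn.T);
```

For the network without hidden units, the layer vector would contain only the 10 output units, e.g. newff(minmax(learn.P), [10], {'logsig'}, 'traingdx').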

Note:

Each member of the team that hands in the network with the highest performance will get 3 *-points.

Hints

  • Normalize the data using prestd and trastd.
  • Before network training set the weight and bias values to small but nonzero values.
  • To see what the digits look like, you can use the commands
      i = 5;                     % index of the training example to display
      p = learn.P(:,i);          % i-th column: one image as a 64-element vector
      x = reshape(p,8,8)';       % reshape into an 8x8 pixel matrix
      imagesc(x); colormap(1-gray);   % display with an inverted gray colormap
    
    to produce an image of the i-th training example.
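
The normalization hint and the performance measure can be sketched as follows, again assuming the old Neural Network Toolbox API (prestd, trastd, sim); the variable names are illustrative only:

```matlab
% Normalize the training inputs to zero mean and unit variance per row,
% and apply the same transformation to the test inputs.
[pn, meanp, stdp] = prestd(learn.P);
ptn = trastd(test.P, meanp, stdp);

% ... train the network net on pn and learn.T as above ...

% Test performance: the predicted class is the output unit with the
% largest activation; compare against the target coding in test.T.
Y = sim(net, ptn);
[dummy, pred]   = max(Y, [], 1);
[dummy, target] = max(test.T, [], 1);
accuracy = 100 * mean(pred == target);   % percentage correctly classified
```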

Remarks

  • Present your results in a clear, structured, and legible form. Document them so that anyone can reproduce them effortlessly.
  • Please hand in a printout of the Matlab program you used.




Footnotes

... traingdx1
Gradient descent with momentum and adaptive learning rate.