Computational Intelligence, SS08
2 VO 442.070 + 1 RU 708.070

Homework 45: Digit classification with backprop

[Points: 12.5; Issued: 2007/03/30; Deadline: 2007/05/08; Tutor: Susanne Rexeis; Info hour: 2007/04/27, 15:15-16:15, HSi11; Review: 2007/05/25, 15:15-16:15, HSi11; Download: pdf; ps.gz]

Neural Networks as Feature Generator [5 points]

Show that the hidden units of a network can find meaningful feature groupings in the following problem, which is based on optical digit recognition.
  • Let your input space consist of an 8×8 pixel grid. Generate 100 training patterns for the category "8" in the following way: start with a block-letter representation of 8, where black pixels have value 0 and white pixels value +1. Generate 100 different versions of this prototype by adding independent random noise to each pixel; let the noise be uniformly distributed between -0.5 and +0.5. Repeat the above procedure for the digits 0 and 3, which you obtain by removing some black pixels from the original (noise-free) version of 8. This gives you a data set of 300 training patterns. (A sketch of this generation step follows the list.)
  • Train a 64-2-3 network with logsig activation functions for this classification task. Use the training function traingdx [1] with standard parameters and train the network for 500 epochs (a setup sketch is given in the Hints below).
  • Display the input-to-hidden weights as 8×8 images, separately for each hidden unit.
  • Can you find any useful features in the weight patterns (features are, in this case, areas with the same weight value)? Interpret your results; in particular, discuss why the hidden layer has chosen such a feature representation.
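
A minimal sketch of the pattern generation, assuming hypothetical 8×8 prototype matrices proto8, proto0, and proto3 (the latter two obtained from proto8 by setting some of its black 0-pixels back to +1):

      nPatterns = 100;
      protos = {proto8, proto0, proto3};      % hypothetical prototype matrices
      P = zeros(64, 3*nPatterns);             % one 64-dim column per pattern
      T = zeros(3, 3*nPatterns);              % 1-of-3 target coding
      for c = 1:3
          for k = 1:nPatterns
              noisy = protos{c} + (rand(8, 8) - 0.5);  % uniform noise in [-0.5, 0.5]
              col = (c-1)*nPatterns + k;
              P(:, col) = noisy(:);
              T(c, col) = 1;
          end
      end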

Hints

  • Before training the network, set the weight and bias values to small but nonzero values (see the sketch below).
  • To visualize the hidden-layer weights you can use the commands
      i = 1;                                      % index of the hidden neuron
      hiddenW1 = reshape(net.IW{1}(i, :), 8, 8);  % its 64 input weights as an 8x8 image
      imagesc(hiddenW1); colormap(1-gray);
    
    to produce an image of the weights of the i-th hidden neuron.
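
A sketch of the corresponding network setup using the old Neural Network Toolbox API, assuming P is the 64×300 input matrix and T the 3×300 target matrix from above:

      net = newff(minmax(P), [2 3], {'logsig', 'logsig'}, 'traingdx');
      % overwrite the default initialization with small but nonzero values
      net.IW{1,1} = 0.01 * randn(2, 64);
      net.LW{2,1} = 0.01 * randn(3, 2);
      net.b{1} = 0.01 * randn(2, 1);
      net.b{2} = 0.01 * randn(3, 1);
      net.trainParam.epochs = 500;
      net = train(net, P, T);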

Digit Classification with Real-World Data [7.5 points]

Similarly to the Optical Character Recognition tutorial, you are asked to train a feed-forward network to perform a digit classification task. The difference from the tutorial is the data set: here you use the Digits data set. The file digits.mat (which you get by unzipping digits.zip) contains about 4000 images (8 × 8 pixels, 16 colors) of training samples (learn.P) of hand-drawn digits (0, 1, ..., 9) together with their classification (learn.T), and about 2000 images of test samples (test.P, test.T).

  • Report the performance (i.e., the percentage of test examples correctly classified) of three network architectures: a network without hidden units, one with 4 hidden units, and one with 20 hidden units. (A sketch of the performance measurement follows the list.)
  • Use the training function traingdx [1]. Find good training parameters (learning rate and momentum term) such that an optimal test performance is achieved.
  • Train the network without heuristics to avoid overfitting (don't use early stopping or weight decay).
  • Discuss the relationship between the network architecture and the performance on the test set.
  • What can you conclude about the complexity of the task from the performance of the network without any hidden units?
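
A minimal sketch of the performance measurement, assuming a trained network net, normalized test inputs Ptest (see the normalization sketch in the Hints below), and 1-of-10 target coding in test.T (adapt the decoding if the targets are stored as plain digit labels):

      Y = sim(net, Ptest);                   % network outputs on the test set
      [dummy, predicted] = max(Y);           % most active output unit per example
      [dummy, target] = max(test.T);
      performance = 100 * sum(predicted == target) / length(target);
      fprintf('correctly classified: %.1f %%\n', performance);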

Hints

  • Normalize the data using prestd and trastd (see the sketch after this list).
  • Before training the network, set the weight and bias values to small but nonzero values.
  • To see what the digits look like you can use the commands
      i = 5;                          % index of the training example
      p = learn.P(:, i);              % its 64 pixel values as a column vector
      x = reshape(p, 8, 8)';          % back to an 8x8 image
      imagesc(x); colormap(1-gray);
    
    to produce an image of the i-th training example.
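
A sketch of the normalization step: prestd computes the statistics from the training set only, and trastd applies exactly the same statistics to the test set (the variable names Pn and Ptest are assumptions).

      [Pn, meanp, stdp] = prestd(learn.P);    % zero mean, unit variance per input
      Ptest = trastd(test.P, meanp, stdp);    % apply the training-set statistics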

Remarks

  • Present your results clearly, in a structured and legible form. Document them in such a way that anybody can reproduce them effortlessly.
  • Use the matriculation number of one of your team members to initialize the random number generator (see the sketch after this list).
  • Please hand in a printout of the MATLAB program you used.
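
A minimal sketch of the initialization, using the pre-R2008 seeding syntax and a hypothetical placeholder for the matriculation number:

      matrikelnr = 1234567;            % replace with your own matriculation number
      rand('state', matrikelnr);       % seeds rand (e.g. the pixel noise)
      randn('state', matrikelnr);      % seeds randn (e.g. the weight initialization)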




Footnotes

[1] traingdx: Gradient descent with momentum and adaptive learning rate.