Next: Decision Boundaries of Backprop Up: NNA_Exercises_2009 Previous: Function Minimization [3 P]

# Digit Classification [3 P]

Compare different conjugate gradient and quasi-Newton methods on a digit classification task. You are required to use MATLAB for this assignment.

a)

Download the digit data set digits.zip. The file digits.mat contains training samples (learn.X, learn.C) and test samples (test.X, test.C). Each sample consists of 64 pixel values, which form an 8x8 image.
(Use d=reshape(learn.X(7,:),8,8)'; imagesc(d); colormap(1-gray); to visualize training sample 7.)

Normalize the data using mapstd or prestd and trastd.
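The normalization step could look roughly like the following sketch. It assumes that digits.mat is on the MATLAB path and that samples are stored as rows (as the reshape call above suggests), so the data is transposed before being passed to mapstd, which works row-wise. The variable names Xn, Xtest_n and ps are illustrative only.

```matlab
load('digits.mat');            % provides learn.X, learn.C, test.X, test.C

% mapstd normalizes each ROW to zero mean and unit standard deviation,
% so the samples (rows of learn.X) are transposed into columns first.
[Xn, ps] = mapstd(learn.X');

% Reuse the training-set statistics for the test set instead of
% recomputing them on the test data.
Xtest_n = mapstd('apply', test.X', ps);

% Older toolbox versions offer the same idea via prestd/trastd:
% [Xn, meanX, stdX] = prestd(learn.X');
% Xtest_n = trastd(test.X', meanX, stdX);

% Pixels that never change in the training set have zero standard
% deviation; this check flags any non-finite values that may result.
if any(~isfinite(Xn(:)))
    warning('Normalized data contains non-finite values.');
end
```

If the check fires, removing the constant pixels first (for example with removeconstantrows) is one possible remedy.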

b)

Initialize the MATLAB random number generator with rand('state',MatrNmr); and randn('state',MatrNmr); where MatrNmr is your matriculation number (use only the MatrNmr of one team member).
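A small sketch of why the seeding matters: resetting both generators to the same state reproduces the same random numbers, which makes weight initialization and data splits repeatable. The value of MatrNmr below is a made-up placeholder, not a real number.

```matlab
MatrNmr = 1234567;             % hypothetical placeholder value

% Seed both generators and draw a few numbers.
rand('state', MatrNmr);
randn('state', MatrNmr);
a = [rand(1, 3), randn(1, 3)];

% Re-seed with the same value and draw again.
rand('state', MatrNmr);
randn('state', MatrNmr);
b = [rand(1, 3), randn(1, 3)];

% Both draws are identical because the generator state was reset.
disp(isequal(a, b));
```

Newer MATLAB releases recommend rng(seed) instead of this legacy syntax, but the calls above are the ones the assignment asks for.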

c)

Use a conjugate gradient method of your choice (e.g. traincgf) and a quasi-Newton method (e.g. trainlm, see help nnet for a list of available training functions) to train a network of your choice (choose an appropriate network architecture and activation functions), so that it generalizes well to the test data. If necessary, adjust the training parameters net.trainParam of the network (e.g. see help traincgf for a list of training parameters for traincgf). Avoid overfitting by means of a method of your choice (early stopping or weight decay). Use train(net,X,T,[],[],V) to hand over a validation set to the training algorithm and automatically activate early stopping.
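The training call with early stopping might be wrapped as in the sketch below. Several things here are assumptions rather than part of the assignment: the helper name train_with_validation, the single hidden layer, the tansig activations, and the parameter values. It also assumes a toolbox version in which newff accepts the data directly as newff(P,T,S,...) and in which V is a struct with fields V.P and V.T; older versions expect newff(minmax(P),[S1 S2],...) instead and have no net.divideFcn property.

```matlab
function [net, tr] = train_with_validation(P, T, V, trainFcn, nHidden)
% P        inputs, one sample per column
% T        targets, one sample per column
% V        validation set, struct with fields V.P and V.T (assumed layout)
% trainFcn name of the training function, e.g. 'traincgf' or 'trainlm'
% nHidden  number of hidden units (hypothetical choice, to be tuned)

% One hidden layer; the output layer size is taken from T.
net = newff(P, T, nHidden, {'tansig', 'tansig'}, trainFcn);

% Assumption: switch off automatic data division, because the
% validation set is passed explicitly below.
net.divideFcn = '';

% Illustrative parameter values only; see help traincgf / help trainlm.
net.trainParam.epochs   = 500;  % maximum number of epochs
net.trainParam.max_fail = 10;   % validation failures before stopping
net.trainParam.show     = 25;   % epochs between progress displays

% Passing V as the sixth argument activates early stopping.
[net, tr] = train(net, P, T, [], [], V);
end
```

Calling the helper once with 'traincgf' and once with 'trainlm' on the same data keeps the two runs comparable.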

Choose an appropriate training and validation set from learn.X and learn.C. (Hint: Because of the huge size of the training set, do not use all training samples to train the network.)
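One way to draw disjoint training and validation subsets is a random permutation of the sample indices, as sketched below. The stand-in data, the subset sizes, and the assumption that learn.C holds one row per sample are all illustrative; in the assignment, X and C would come from learn.X and learn.C, and the labels may still need to be converted into the target format of the network.

```matlab
% Stand-in data with one sample per row; in the assignment use
% X = learn.X and C = learn.C after load('digits.mat').
X = rand(1000, 64);
C = floor(10 * rand(1000, 1));      % hypothetical labels

nTrain = 300;                       % hypothetical subset sizes
nVal   = 100;

% Shuffle all sample indices once, then cut disjoint slices.
idx      = randperm(size(X, 1));
trainIdx = idx(1:nTrain);
valIdx   = idx(nTrain+1 : nTrain+nVal);

% The toolbox expects one sample per column, hence the transposes.
P   = X(trainIdx, :)';
T   = C(trainIdx, :)';
V.P = X(valIdx, :)';                % validation inputs
V.T = C(valIdx, :)';                % validation targets
```

Because the slices come from a single permutation, no sample can appear in both sets.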

Before training the network, set the weight and bias values to small but nonzero values.
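A minimal sketch of such an initialization, assuming a network object created with the toolbox: every connected weight matrix and bias vector is overwritten with small Gaussian random values. The helper name init_small_weights and the scale argument are hypothetical.

```matlab
function net = init_small_weights(net, scale)
% scale: standard deviation of the initial values, e.g. 0.1 (assumed)

for i = 1:net.numLayers
    % Weights from network inputs to layer i.
    for j = 1:net.numInputs
        if net.inputConnect(i, j)
            net.IW{i, j} = scale * randn(size(net.IW{i, j}));
        end
    end
    % Weights from layer j to layer i.
    for j = 1:net.numLayers
        if net.layerConnect(i, j)
            net.LW{i, j} = scale * randn(size(net.LW{i, j}));
        end
    end
    % Bias vector of layer i.
    if net.biasConnect(i)
        net.b{i} = scale * randn(size(net.b{i}));
    end
end
end
```

Drawing from randn after seeding the generator keeps the initial weights reproducible; exact zeros are practically impossible with continuous random values.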

Justify and explain your choice of network parameters and training functions.

d)

Compare the convergence speed of the different training methods for the network architecture obtained in c).
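Convergence could be compared by plotting the training error per epoch for both methods on one logarithmic axis. The sketch below assumes that trA and trB are the training records returned as the second output of train, with fields epoch and perf; the helper name compare_convergence and the label arguments are hypothetical.

```matlab
function compare_convergence(trA, trB, nameA, nameB)
% trA, trB     training records returned by train
% nameA, nameB legend labels, e.g. 'traincgf' and 'trainlm'

% A log scale makes differences visible late in training as well.
semilogy(trA.epoch, trA.perf, 'b-', trB.epoch, trB.perf, 'r--');
xlabel('Epoch');
ylabel('Training error');
legend(nameA, nameB);
grid on;
end
```

Epoch counts alone can be misleading, since one epoch of trainlm may cost more than one epoch of traincgf; wrapping each train call in tic and toc adds the wall-clock time to the comparison.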

Present your results in a clear, structured, and legible form. Document them in such a way that anybody can easily reproduce them.

Haeusler Stefan 2010-01-19