Computational Intelligence, SS08
2 VO 442.070 + 1 RU 708.070

Homework 25: Gradient descent learning rule



[Points: 8; Issued: 2005/03/18; Deadline: 2005/04/28; Tutor: Peter Bliem; Info hour: 2005/04/25, 12:00-13:00, Seminarraum IGI; Review (Einsichtnahme): 2005/05/16, 12:00-13:00, Seminarraum IGI; Download: pdf; ps.gz]





Consider a feedforward network of depth 3 with 4 inputs, 2 sigmoidal gates on each of the 2 hidden layers, and a linear output gate. Derive the learning rule for each of the weights in the network when you apply gradient descent (with learning rate $ \eta$) to the MSE for a single training example $ \langle \mathbf{a},b\rangle $.
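Hint (general setup only; the notation $ E$ and $ y(\mathbf{a};\mathbf{w})$ is introduced here and may differ slightly from the course notes): gradient descent on the MSE for the single training example $ \langle \mathbf{a},b\rangle $ minimizes

$\displaystyle E(\mathbf{w}) = \frac{1}{2}\bigl(y(\mathbf{a};\mathbf{w})-b\bigr)^{2},$

where $ y(\mathbf{a};\mathbf{w})$ is the network output (the factor $ \frac{1}{2}$ is a common convention), and updates each weight as

$\displaystyle w_{ij} \leftarrow w_{ij} - \eta\,\frac{\partial E}{\partial w_{ij}},$

with the partial derivatives obtained by applying the chain rule through the linear output gate and the two sigmoidal hidden layers.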

Compare the learning rule to the general backprop rule. In particular, state explicitly the value of the parameter $ \alpha$ for the network considered here.

Note:

You can get 3 $ *$-points if you also derive the learning rules for the weights of the corresponding network in which the hidden-layer units are Radial Basis Function (RBF) units, as defined in section 1.6 of Supervised Learning for Neural Networks: a tutorial with JAVA exercises [1] by W. Gerstner. Find the learning rule for the weights in layers 1 and 2 if you apply gradient descent (learning rate $ \eta$) to the MSE for a single training example $ \langle \mathbf{a},b\rangle $.
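For the $ *$-point part, a commonly used Gaussian form of an RBF unit (check this against the definition in section 1.6 of the tutorial, which is the one that applies here) is

$\displaystyle x_j = \exp\!\left(-\frac{\Vert\mathbf{a}-\mathbf{c}_j\Vert^{2}}{2\sigma_j^{2}}\right),$

with center $ \mathbf{c}_j$ and width $ \sigma_j$. Under this assumption the chain rule yields, for example, $ \frac{\partial x_j}{\partial c_{jk}} = x_j\,\frac{a_k-c_{jk}}{\sigma_j^{2}}$ for the center parameters, in addition to the derivatives with respect to the output weights.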




Footnotes

[1] http://diwww.epfl.ch/mantra/tutorial/docs/supervised.pdf