
Cross-Entropy Error Function [2* P]

Consider a neural network with one output, $h$ hidden neurons and $m$ inputs. For two-class classification problems with classes $C_0$ and $C_1$ we want the neural network to compute the probability $P(C_1 \vert {\bf x}) = y$ that an input vector ${\bf x}$ belongs to class $C_1$ (and so $P(C_0 \vert {\bf x}) = 1-y$). For this case we often use the cross-entropy error function (where $y_n$ is the network's prediction for $P(C_1 \vert {\bf x}_n)$ and $t_n \in \{0, 1\}$):

$\displaystyle E_{CE}\left( \{ \langle {\bf x}_1, t_1 \rangle, \ldots, \langle {\bf x}_N, t_N \rangle \} \right) = - \sum_{n=1}^N \left[ t_n \cdot \ln y_n + (1 - t_n) \cdot \ln(1-y_n) \right] $
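(As a hedged side note, not part of the exercise statement: differentiating a single term of $E_{CE}$ with respect to the network output $y_n$ gives

$\displaystyle \frac{\partial E_{CE}}{\partial y_n} = -\frac{t_n}{y_n} + \frac{1-t_n}{1-y_n} = \frac{y_n - t_n}{y_n (1-y_n)}, $

which hints at why this error function pairs naturally with a logistic output unit.)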

Derive the backpropagation updates of the weights in both layers for the cross-entropy error function, assuming that all neurons use the logistic activation function $\sigma(x) = \frac{1}{1+\exp(-x)}$. (Hint: first prove that $\sigma'(x) = \sigma(x) \cdot (1 - \sigma(x))$.)
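(One possible route for the hint, sketched here as an aid rather than the required derivation: writing $\sigma(x) = (1+e^{-x})^{-1}$ and differentiating,

$\displaystyle \sigma'(x) = \frac{e^{-x}}{(1+e^{-x})^2} = \sigma(x)\cdot\frac{e^{-x}}{1+e^{-x}} = \sigma(x)\cdot(1-\sigma(x)). $

Combined with the gradient of $E_{CE}$ noted above and the chain rule for the output activation $y_n = \sigma(a_n)$, the $y_n(1-y_n)$ factors cancel, so the output-layer error term reduces to $\partial E_{CE}/\partial a_n = y_n - t_n$.)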


