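Consider a neural network with one output $y$, $H$ hidden neurons, and $d$ inputs. For two-class classification problems with classes $C_1$ and $C_2$, we want the neural network to compute the probability $P(C_1 \mid \mathbf{x})$ that an input vector $\mathbf{x}$ belongs to class $C_1$ (and so $P(C_2 \mid \mathbf{x}) = 1 - P(C_1 \mid \mathbf{x})$). For this case we often use the cross-entropy error function (where $y_n$ is the network's prediction for $\mathbf{x}_n$ and $t_n \in \{0, 1\}$ is the target, with $t_n = 1$ when $\mathbf{x}_n$ belongs to $C_1$):

$$E = -\sum_n \left[ t_n \ln y_n + (1 - t_n) \ln (1 - y_n) \right]$$
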
Derive the backpropagation update of the weights in both layers for the cross-entropy error function, assuming the neurons compute the logistic activation function $\sigma(a) = 1 / (1 + e^{-a})$. (Hint: Prove first that $\sigma'(a) = \sigma(a)\,(1 - \sigma(a))$.)
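
A derived update can be checked numerically before trusting it. The following is a minimal sketch, not a definitive solution. It assumes a network without bias terms, a hidden-layer weight matrix `W` (one row per hidden neuron), an output weight vector `v`, a single training example `x` with target `t` in {0, 1}, and a learning rate `lr`. All of these names, and the helpers `gradients`, `max_gradient_error`, and `sgd_step`, are hypothetical and chosen only for illustration. The sketch uses the standard results that the output delta reduces to `y - t` and that each hidden delta picks up the factor `h_j * (1 - h_j)` from the logistic derivative, then compares them against central finite differences.

```python
import math


def sigmoid(a):
    """Logistic function 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + math.exp(-a))


def forward(W, v, x):
    """Return the hidden activations h and the network output y."""
    h = [sigmoid(sum(w * x_i for w, x_i in zip(row, x))) for row in W]
    y = sigmoid(sum(v_j * h_j for v_j, h_j in zip(v, h)))
    return h, y


def cross_entropy(y, t):
    """Cross-entropy error for one example with target t in {0, 1}."""
    return -(t * math.log(y) + (1 - t) * math.log(1 - y))


def error(W, v, x, t):
    """Error of the network on a single example."""
    _, y = forward(W, v, x)
    return cross_entropy(y, t)


def gradients(W, v, x, t):
    """Analytic gradients of the error with respect to v and W."""
    h, y = forward(W, v, x)
    # Output delta: the factor y * (1 - y) from the logistic derivative
    # cancels against the derivative of the cross-entropy, leaving y - t.
    delta_out = y - t
    grad_v = [delta_out * h_j for h_j in h]
    # Hidden deltas: back-propagate through v_j, then multiply by
    # sigma'(a_j) = h_j * (1 - h_j).
    delta_hid = [delta_out * v_j * h_j * (1 - h_j) for v_j, h_j in zip(v, h)]
    grad_W = [[d_j * x_i for x_i in x] for d_j in delta_hid]
    return grad_v, grad_W


def max_gradient_error(W, v, x, t, eps=1e-5):
    """Largest gap between analytic and central-difference gradients."""
    grad_v, grad_W = gradients(W, v, x, t)
    worst = 0.0
    for j in range(len(v)):
        v_plus, v_minus = list(v), list(v)
        v_plus[j] += eps
        v_minus[j] -= eps
        numeric = (error(W, v_plus, x, t) - error(W, v_minus, x, t)) / (2 * eps)
        worst = max(worst, abs(numeric - grad_v[j]))
        for i in range(len(x)):
            W_plus = [list(row) for row in W]
            W_minus = [list(row) for row in W]
            W_plus[j][i] += eps
            W_minus[j][i] -= eps
            numeric = (error(W_plus, v, x, t) - error(W_minus, v, x, t)) / (2 * eps)
            worst = max(worst, abs(numeric - grad_W[j][i]))
    return worst


def sgd_step(W, v, x, t, lr=0.1):
    """One gradient-descent update of both weight layers."""
    grad_v, grad_W = gradients(W, v, x, t)
    new_v = [v_j - lr * g for v_j, g in zip(v, grad_v)]
    new_W = [[w - lr * g for w, g in zip(row, g_row)]
             for row, g_row in zip(W, grad_W)]
    return new_W, new_v


if __name__ == "__main__":
    # Arbitrary illustrative values: 3 inputs, 2 hidden neurons.
    W = [[0.5, -0.3, 0.8], [-0.6, 0.1, 0.4]]
    v = [0.7, -0.2]
    x = [1.0, 0.5, -1.5]
    t = 1
    print("max gradient mismatch:", max_gradient_error(W, v, x, t))
    W2, v2 = sgd_step(W, v, x, t)
    print("error before:", error(W, v, x, t), "after:", error(W2, v2, x, t))
```

If the analytic expressions are correct, the reported mismatch should be tiny compared with the gradients themselves. A large mismatch usually points to a sign error or a missing factor of `h_j * (1 - h_j)` in the hidden layer.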