
Quasi-Newton's Method [3 P]

Derive an expression for the outer product (Quasi-Newton) approximation to the Hessian matrix for a network having $K$ outputs with a softmax output-unit activation function

$\displaystyle y_k({\bf x},{\bf w}) = \frac{\exp(o_k({\bf x},{\bf w}))}{\sum_{j=1}^K\exp(o_j({\bf x},{\bf w}))},$    

and output unit activations $o_k$, where $k = 1,\dots,K$, together with a cross-entropy error function $E_{CE}$. The expression should correspond to the result

$\displaystyle H({\bf w}) \approx J({\bf w})^T J({\bf w})$    

with $J_{ni}({\bf w}) = \frac{\partial o_n}{\partial w_i}$, which holds for the sum-of-squares error function

$\displaystyle E = \frac{1}{2}\sum_{n=1}^{N}(y_n - t_n)^2$    

and a linear output unit activation function, i.e. $y_n = o_n$.
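Before deriving the softmax case, it can help to see numerically why the sum-of-squares reference result holds. The sketch below is not part of the exercise: it assumes a hypothetical toy network with one tanh hidden unit and a linear output, $o_n = w_3 \tanh(w_1 x_n + w_2)$, and all function names (`model`, `jacobian`, `numerical_hessian`) are illustrative. It compares $J^T J$ with a finite-difference Hessian of $E$, once with targets the model fits exactly and once with noisy targets.

```python
import numpy as np

def model(w, X):
    """Toy network (an assumption): o_n = w[2] * tanh(w[0] * x_n + w[1]), linear output."""
    return w[2] * np.tanh(w[0] * X + w[1])

def jacobian(w, X):
    """Analytic Jacobian J[n, i] = d o_n / d w_i of the toy network."""
    h = np.tanh(w[0] * X + w[1])
    dh = 1.0 - h ** 2                      # derivative of tanh
    return np.stack([w[2] * dh * X, w[2] * dh, h], axis=1)

def sse(w, X, t):
    """Sum-of-squares error E = 1/2 * sum_n (y_n - t_n)^2 with y_n = o_n."""
    return 0.5 * np.sum((model(w, X) - t) ** 2)

def numerical_hessian(f, w, eps=1e-4):
    """Central finite-difference Hessian of a scalar function f at w."""
    d = len(w)
    H = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            ei = np.zeros(d)
            ej = np.zeros(d)
            ei[i] = eps
            ej[j] = eps
            H[i, j] = (f(w + ei + ej) - f(w + ei - ej)
                       - f(w - ei + ej) + f(w - ei - ej)) / (4.0 * eps ** 2)
    return H

rng = np.random.default_rng(0)
X = rng.normal(size=20)                    # 20 scalar inputs
w = np.array([0.7, -0.3, 1.5])             # arbitrary weight vector

J = jacobian(w, X)
H_outer = J.T @ J                          # outer product approximation

# Case 1: targets equal to the outputs, so every residual y_n - t_n is zero.
t_fit = model(w, X)
H_fit = numerical_hessian(lambda v: sse(v, X, t_fit), w)
gap_zero_residual = np.max(np.abs(H_outer - H_fit))

# Case 2: noisy targets, so the residual-weighted second-derivative term is present.
t_noisy = t_fit + rng.normal(scale=0.5, size=X.shape)
H_noisy = numerical_hessian(lambda v: sse(v, X, t_noisy), w)
gap_noisy = np.max(np.abs(H_outer - H_noisy))

print("max |J^T J - H| with zero residuals: ", gap_zero_residual)
print("max |J^T J - H| with noisy targets:  ", gap_noisy)
```

With zero residuals the two matrices agree up to finite-difference error, while noisy targets leave a visible gap. That gap is the term $\sum_n (y_n - t_n)\nabla\nabla y_n$ that the outer product approximation drops. The same kind of check can be applied to whatever expression is derived for the softmax and cross-entropy case.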

Haeusler Stefan 2010-01-19