Consider a multilayer network with arbitrary feed-forward topology, which is to be trained by minimizing the tangent propagation error function

$$\tilde{E} = E + \lambda\,\Omega \tag{7}$$

where $\lambda$ is a regularization coefficient,

$$\Omega = \frac{1}{2}\sum_n \sum_k \left( \sum_i J_{nki}\,\tau_{ni} \right)^{\!2} \tag{8}$$

and $J_{nki} = \partial y_{nk}/\partial x_{ni}$ denotes the Jacobian matrix for input pattern $n$, with $\tau_{ni}$ the corresponding tangent vector. Show that the regularization term $\Omega$ can be written as a sum over patterns, $\Omega = \sum_n \Omega_n$, of terms of the form

$$\Omega_n = \frac{1}{2}\sum_k \left( \mathcal{G}\,y_k \right)^2 \tag{9}$$

where $\mathcal{G}$ is a differential operator defined by

$$\mathcal{G} \equiv \sum_i \tau_i \frac{\partial}{\partial x_i} \tag{10}$$
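As a hint for the first part, note that applying the operator $\mathcal{G}$ to a network output gives the directional derivative of that output along the tangent vector:

$$\mathcal{G}\,y_k = \sum_i \tau_i \frac{\partial y_k}{\partial x_i} = \sum_i J_{ki}\,\tau_i$$

so squaring and summing over $k$ recovers, for each pattern, the bracketed term in (8).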

By acting on the forward propagation equations

$$z_j = h(a_j), \qquad a_j = \sum_i w_{ji} z_i \tag{11}$$

with the operator $\mathcal{G}$, show that $\Omega_n$ can be evaluated by forward propagation using the following equations:

$$\alpha_j = h'(a_j)\,\beta_j, \qquad \beta_j = \sum_i w_{ji}\,\alpha_i \tag{12}$$

where we have defined the new variables

$$\alpha_j \equiv \mathcal{G}\,z_j, \qquad \beta_j \equiv \mathcal{G}\,a_j \tag{13}$$
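The forward propagation of (12)–(13) can be checked numerically. The sketch below assumes a small two-layer network with tanh hidden units and linear outputs (the architecture, shapes, and variable names are illustrative, not part of the exercise); at the input units $\alpha_i = \mathcal{G}x_i = \tau_i$, and the result $\alpha_k = \mathcal{G}y_k$ is compared with a finite-difference directional derivative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-layer network: inputs x -> tanh hidden units -> linear outputs.
D, H, K = 3, 5, 2
W1 = rng.standard_normal((H, D))   # first-layer weights w_ji
W2 = rng.standard_normal((K, H))   # second-layer weights

def forward(x):
    """Forward propagation (11): a_j = sum_i w_ji z_i, z_j = h(a_j)."""
    a1 = W1 @ x
    z1 = np.tanh(a1)
    y = W2 @ z1                      # linear output units: y_k = a_k
    return a1, z1, y

def g_forward(x, tau):
    """Propagate alpha_j = G z_j and beta_j = G a_j using (12),
    starting from alpha_i = tau_i at the input units."""
    a1, z1, y = forward(x)
    beta1 = W1 @ tau                 # beta_j = sum_i w_ji alpha_i
    alpha1 = (1.0 - z1**2) * beta1   # alpha_j = h'(a_j) beta_j, h'(a) = 1 - tanh^2(a)
    beta2 = W2 @ alpha1
    alpha2 = beta2                   # linear outputs: h'(a) = 1
    return alpha2                    # alpha_k = G y_k

x = rng.standard_normal(D)
tau = rng.standard_normal(D)
Gy = g_forward(x, tau)

# Check G y_k against the central difference (y(x + eps*tau) - y(x - eps*tau)) / (2 eps).
eps = 1e-6
fd = (forward(x + eps * tau)[2] - forward(x - eps * tau)[2]) / (2 * eps)
print(np.max(np.abs(Gy - fd)))       # should be tiny

# Omega_n from (9): one forward pass per pattern suffices.
Omega_n = 0.5 * np.sum(Gy**2)
```

A single extra forward pass per pattern therefore evaluates $\Omega_n$ exactly, without forming the full Jacobian.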

Now show that the derivatives of $\Omega_n$ with respect to a weight $w_{rs}$ in the network can be written in the form

$$\frac{\partial \Omega_n}{\partial w_{rs}} = \sum_k \alpha_k \left\{ \phi_{kr}\,z_s + \delta_{kr}\,\alpha_s \right\} \tag{14}$$

where we have defined

$$\delta_{kr} \equiv \frac{\partial y_k}{\partial a_r}, \qquad \phi_{kr} \equiv \mathcal{G}\,\delta_{kr} \tag{15}$$

Write down the backpropagation equations for the $\delta_{kr}$, and hence derive a set of backpropagation equations for the evaluation of the $\phi_{kr}$.
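Equation (14) can also be verified numerically. For the same illustrative two-layer tanh network as above (again an assumption, not part of the exercise), the output units give $\delta_{kk'} = I_{kk'}$ and $\phi_{kk'} = 0$, so one backward step yields $\delta_{kr} = w_{kr}\,h'(a_r)$ and $\phi_{kr} = w_{kr}\,h''(a_r)\,\beta_r$ for hidden units $r$; for a first-layer weight $w_{rs}$, the quantities $z_s$ and $\alpha_s$ in (14) are simply $x_s$ and $\tau_s$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical two-layer network with tanh hidden units and linear outputs.
D, H, K = 3, 4, 2
W1 = rng.standard_normal((H, D))
W2 = rng.standard_normal((K, H))
x = rng.standard_normal(D)
tau = rng.standard_normal(D)

def omega_n(W1, W2):
    """Omega_n = (1/2) sum_k (G y_k)^2 via the forward equations (12)."""
    a1 = W1 @ x
    z1 = np.tanh(a1)
    beta1 = W1 @ tau
    alpha1 = (1 - z1**2) * beta1
    alpha_out = W2 @ alpha1          # linear outputs: alpha_k = G y_k
    return 0.5 * np.sum(alpha_out**2), z1, beta1, alpha_out

Om, z1, beta1, alpha_out = omega_n(W1, W2)

# One backward step from the linear outputs (delta_kk' = I, phi_kk' = 0):
h1 = 1 - z1**2                       # h'(a_r) for tanh
h2 = -2 * z1 * h1                    # h''(a_r) for tanh
delta = W2 * h1[None, :]             # delta_kr = w_kr h'(a_r), shape (K, H)
phi = W2 * (h2 * beta1)[None, :]     # phi_kr = w_kr h''(a_r) beta_r

# Derivative (14) for first-layer weights w_rs (here z_s = x_s, alpha_s = tau_s):
grad = np.outer(alpha_out @ phi, x) + np.outer(alpha_out @ delta, tau)

# Compare with central finite differences of Omega_n.
eps = 1e-6
fd = np.zeros_like(W1)
for r in range(H):
    for s in range(D):
        Wp = W1.copy(); Wp[r, s] += eps
        Wm = W1.copy(); Wm[r, s] -= eps
        fd[r, s] = (omega_n(Wp, W2)[0] - omega_n(Wm, W2)[0]) / (2 * eps)

print(np.max(np.abs(grad - fd)))     # should be tiny
```

The agreement reflects the fact that $\mathcal{G}$ commutes with $\partial/\partial w_{rs}$: differentiating $\alpha_k = \mathcal{G}y_k$ and using $\partial y_k/\partial w_{rs} = \delta_{kr} z_s$ gives the two terms $\phi_{kr} z_s + \delta_{kr}\alpha_s$ in (14).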