
Regression Support Vector Machine [2* P]

Consider the Lagrangian

$\displaystyle L = C \sum_{n=1}^{N}(\xi_n + \hat{\xi}_n) + \frac{1}{2} \Vert {\bf w} \Vert^2 - \sum_{n=1}^{N}(\mu_n\xi_n + \hat{\mu}_n\hat{\xi}_n) - \sum_{n=1}^{N} a_n(\epsilon +\xi_n+y_n-t_n) - \sum_{n=1}^{N} \hat{a}_n(\epsilon +\hat{\xi}_n-y_n+t_n)$

for the regression support vector machine, where

$\displaystyle y({\bf x}) = {\bf w}^T\phi({\bf x}) + b,$

$ a_n \geq 0$, $ \hat{a}_n \geq 0$, $ \mu_n \geq 0$, and $ \hat{\mu}_n \geq 0$ are the Lagrange multipliers, and $ \xi_n \geq 0$, $ \hat{\xi}_n \geq 0$ are the slack variables.

By setting the derivatives of the Lagrangian with respect to $ \bf w$, $ b$, $ \xi_n$, and $ \hat{\xi}_n$ to zero, and then back-substituting to eliminate the corresponding variables, show that the dual Lagrangian is given by

$\displaystyle \tilde{L}({\bf a},\hat{{\bf a}}) = -\frac{1}{2}\sum_{n=1}^{N} \sum_{m=1}^{N} (a_n - \hat{a}_n)(a_m - \hat{a}_m)k({\bf x}_n,{\bf x}_m) -\epsilon \sum_{n=1}^{N}(a_n + \hat{a}_n) + \sum_{n=1}^{N}(a_n - \hat{a}_n)t_n,$

where $ k({\bf x}_n,{\bf x}_m) = \phi({\bf x}_n)^T\phi({\bf x}_m)$ denotes the kernel function.
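As a hint for the first step, the four stationarity conditions follow directly from differentiating the Lagrangian stated above:

```latex
\frac{\partial L}{\partial {\bf w}} = 0 \;\Rightarrow\; {\bf w} = \sum_{n=1}^{N}(a_n - \hat{a}_n)\phi({\bf x}_n),
\qquad
\frac{\partial L}{\partial b} = 0 \;\Rightarrow\; \sum_{n=1}^{N}(a_n - \hat{a}_n) = 0,
\frac{\partial L}{\partial \xi_n} = 0 \;\Rightarrow\; a_n + \mu_n = C,
\qquad
\frac{\partial L}{\partial \hat{\xi}_n} = 0 \;\Rightarrow\; \hat{a}_n + \hat{\mu}_n = C.
```

Substituting the first condition eliminates $ \bf w$ (the term $ \frac{1}{2}\Vert {\bf w} \Vert^2$ combines with the cross terms into the double sum over the kernel), the second removes $ b$, and the last two make every term containing $ \xi_n$ or $ \hat{\xi}_n$ cancel.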


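The claimed equality can also be checked numerically: the sketch below (using a 1-D linear kernel $ \phi(x)=x$, with randomly chosen multipliers that satisfy the stationarity conditions) evaluates both the primal Lagrangian and the dual form and confirms they agree. All variable names here are illustrative, not part of the exercise.

```python
import numpy as np

rng = np.random.default_rng(0)
N, C, eps = 6, 1.0, 0.1

# Toy 1-D data with the identity feature map phi(x) = x,
# so the kernel is k(x_n, x_m) = x_n * x_m.
x = rng.normal(size=N)
t = rng.normal(size=N)

# Multipliers chosen to satisfy the stationarity conditions:
#   a_n + mu_n = C and a_hat_n + mu_hat_n = C  (so 0 < a_n, a_hat_n < C),
#   sum_n (a_n - a_hat_n) = 0  (from dL/db = 0; reversing a gives a_hat
#   with the same sum, which enforces this exactly).
a = rng.uniform(0.2 * C, 0.8 * C, size=N)
a_hat = a[::-1].copy()
mu, mu_hat = C - a, C - a_hat

# Slack variables and bias are arbitrary: their coefficients vanish
# at the stationary point, so L must not depend on them.
xi = rng.uniform(0, 1, size=N)
xi_hat = rng.uniform(0, 1, size=N)
b = rng.normal()

# w from dL/dw = 0:  w = sum_n (a_n - a_hat_n) * phi(x_n)
d = a - a_hat
w = np.sum(d * x)
y = w * x + b

# Primal Lagrangian, term by term as stated in the exercise.
L = (C * np.sum(xi + xi_hat) + 0.5 * w**2
     - np.sum(mu * xi + mu_hat * xi_hat)
     - np.sum(a * (eps + xi + y - t))
     - np.sum(a_hat * (eps + xi_hat - y + t)))

# Dual Lagrangian L~(a, a_hat).
K = np.outer(x, x)
L_dual = -0.5 * d @ K @ d - eps * np.sum(a + a_hat) + d @ t

print(np.isclose(L, L_dual))  # True: primal and dual agree at the stationary point
```

Because the slack variables and the bias were drawn at random, agreement of the two values also confirms that their coefficients really do cancel after back-substitution.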

Haeusler Stefan 2007-12-03