Next: Decision Theory [2+2* P] Up: NNA_Exercises_2012 Previous: Linear models for regression

Linear models for regression II [4 P]

Figure 1: Bias-variance trade-off (image: fig_biasvariance).

Fit a curve by computing the MAP solution of a linear model with Gaussian basis functions, where the targets are generated by a target function corrupted by Gaussian noise. You are required to use MATLAB for this assignment. A template for the MATLAB file, as presented in the lecture, is available for download on the course homepage. Complete the lines marked with $ ..............$ in the file biasvariance.m (search for the tag HOMEWORK) as required for the following points.

a)
Generate $ L = 50$ datasets, each containing $ N=50$ data points, where $ x$ is drawn uniformly and independently from the interval $ [0, 1]$ and the target $ t = h(x) + \epsilon$ is the sum of the deterministic function $ h(x)=\sin(2\pi x)$ and a zero-mean Gaussian random variable $ \epsilon$ with precision (inverse variance) $ \beta=2500$ .
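The sampling scheme in a) can be sketched as follows. This is only an illustration in Python (standard library), not the MATLAB template from the lecture; the names `h`, `make_datasets` and the fixed seed are assumptions made for this sketch. The one step that is easy to get wrong is converting the precision $ \beta$ into the standard deviation $ \sigma = 1/\sqrt{\beta}$ expected by most random number generators.

```python
import math
import random


def h(x):
    """Deterministic target function h(x) = sin(2*pi*x)."""
    return math.sin(2.0 * math.pi * x)


def make_datasets(L=50, N=50, beta=2500.0, seed=0):
    """Return L datasets, each a pair (xs, ts) of N inputs and noisy targets."""
    rng = random.Random(seed)      # fixed seed: an assumption, for reproducibility
    sigma = 1.0 / math.sqrt(beta)  # precision beta -> standard deviation
    datasets = []
    for _ in range(L):
        xs = [rng.random() for _ in range(N)]            # x uniform on [0, 1)
        ts = [h(x) + rng.gauss(0.0, sigma) for x in xs]  # t = h(x) + eps
        datasets.append((xs, ts))
    return datasets


datasets = make_datasets()
```

With $ \beta = 2500$ the noise standard deviation is $ 0.02$ , so the samples lie very close to the sine curve.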

b)
Implement $ M=24$ Gaussian basis functions defined by

$\displaystyle \phi_i(x) = \exp\left(-\frac{(x-\frac{i}{M+1})^2}{1/(2 M^2)}\right),$

with $ i=1,...,M$ for the linear model.

c)
Calculate the MAP solutions for the linear model for each of the $ L = 50$ datasets. The prior distribution for the weights of the linear model is given by

$\displaystyle P({\bf w})=\mathcal{N}({\bf w}\vert{\bf m}_0,{\bf S}_0),$

with $ {\bf m}_0 = {\bf 0}$ (the zero vector of length $ M$ ) and $ {\bf S}_0 = \alpha^{-1} {\bf I}$ , where $ {\bf I}$ denotes the $ M \times M$ identity matrix. Determine the average prediction

$\displaystyle \bar{y}(x) = \frac{1}{L}\sum_{l=1}^{L}y^{(l)}(x),$    

where $ y^{(l)}$ denotes the output of the model trained on dataset $ l$ , the integrated squared bias, and the integrated variance

$\displaystyle (\mathrm{bias})^2 = \frac{1}{K}\sum_{k=1}^{K}\{\bar{y}(x_k)-h(x_k)\}^2$

$\displaystyle \mathrm{variance} = \frac{1}{K}\sum_{k=1}^{K}\frac{1}{L}\sum_{l=1}^{L}\{y^{(l)}(x_k) - \bar{y}(x_k)\}^2,$

where $ x_k = k/K$ and $ K = 100$ , and the average test error (mean squared error) for 1000 test samples drawn from the same distribution as the training samples. Determine each of these quantities for $ \ln \alpha = -6,-3,0,...,12$ . Plot and interpret their dependence on $ \lambda = \alpha/\beta$ and discuss the results in the context of the bias-variance trade-off. Create a plot that looks like Figure 1. Which regularization parameter $ \lambda$ should be chosen?
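For a zero-mean isotropic Gaussian prior and Gaussian noise, the MAP weights coincide with the posterior mean, $ {\bf w}_{MAP} = (\alpha {\bf I} + \beta \Phi^T\Phi)^{-1}\beta\Phi^T{\bf t} = (\lambda {\bf I} + \Phi^T\Phi)^{-1}\Phi^T{\bf t}$ with $ \lambda = \alpha/\beta$ , where $ \Phi$ is the $ N \times M$ design matrix. The sketch below puts the steps of c) together under that assumption. It is written in Python with the standard library only, not in MATLAB, and is not the lecture template. All function names, the fixed seed and the hand-written linear solver are assumptions made for illustration, and the test error on 1000 samples is left out to keep the example short.

```python
import math
import random

M, L, N, K, BETA = 24, 50, 50, 100, 2500.0


def h(x):
    return math.sin(2.0 * math.pi * x)


def features(x):
    """One design-matrix row: phi_1(x), ..., phi_M(x)."""
    return [math.exp(-2.0 * M ** 2 * (x - i / (M + 1)) ** 2)
            for i in range(1, M + 1)]


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b_i] for row, b_i in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[r][j] -= f * aug[c][j]
    w = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][j] * w[j] for j in range(r + 1, n))
        w[r] = (aug[r][n] - s) / aug[r][r]
    return w


def map_weights(xs, ts, lam):
    """w_MAP = (lam * I + Phi^T Phi)^{-1} Phi^T t, with lam = alpha / beta."""
    Phi = [features(x) for x in xs]
    A = [[sum(row[i] * row[j] for row in Phi) + (lam if i == j else 0.0)
          for j in range(M)] for i in range(M)]
    b = [sum(row[i] * t for row, t in zip(Phi, ts)) for i in range(M)]
    return solve(A, b)


def make_datasets(rng):
    sigma = 1.0 / math.sqrt(BETA)  # precision -> standard deviation
    data = []
    for _ in range(L):
        xs = [rng.random() for _ in range(N)]
        data.append((xs, [h(x) + rng.gauss(0.0, sigma) for x in xs]))
    return data


def bias_variance(datasets, ln_alpha):
    """Integrated squared bias and variance on the grid x_k = k/K."""
    lam = math.exp(ln_alpha) / BETA
    grid_x = [k / K for k in range(1, K + 1)]
    grid_phi = [features(x) for x in grid_x]
    preds = []  # preds[l][k] = y^(l)(x_k)
    for xs, ts in datasets:
        w = map_weights(xs, ts, lam)
        preds.append([sum(wi * fi for wi, fi in zip(w, f)) for f in grid_phi])
    n_sets = len(datasets)
    y_bar = [sum(p[k] for p in preds) / n_sets for k in range(K)]
    bias2 = sum((y_bar[k] - h(grid_x[k])) ** 2 for k in range(K)) / K
    var = sum((p[k] - y_bar[k]) ** 2
              for p in preds for k in range(K)) / (K * n_sets)
    return bias2, var


datasets = make_datasets(random.Random(0))  # seed chosen arbitrarily
results = {a: bias_variance(datasets, a) for a in range(-6, 13, 3)}
for a, (bias2, var) in results.items():
    print(f"ln(alpha)={a:3d}  bias^2={bias2:.5f}  variance={var:.5f}")
```

The same $ L$ datasets are reused for every value of $ \alpha$ , so differences between rows of the output reflect the regularization and not a fresh draw of the data. Strong regularization (large $ \ln \alpha$ ) shrinks the weights towards zero, which is what drives the squared bias up at that end of the range.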


Haeusler Stefan 2013-01-16