Computational Intelligence, SS08
2 VO 442.070 + 1 RU 708.070 last updated:
Course Notes (Skriptum)

The RLS Adaptation Algorithm

The RLS (recursive least squares) algorithm is another algorithm for determining the coefficients of an adaptive filter. In contrast to the LMS algorithm, the RLS algorithm uses information from all past input samples (and not only from the current tap-input vector) to estimate the (inverse of the) autocorrelation matrix of the input vector. To decrease the influence of input samples from the far past, a weighting factor for each sample is used. This weighting factor is introduced in the cost function

$\displaystyle J[n] = \sum_{i=1}^{n} \rho ^{n-i} \vert e[i,n]\vert^2,$ (10)

where the error signal $ e[i,n]$ is computed for all times $ 1\le i\le n$ using the current filter coefficients $ \mathbf{c}[n]$: $ e[i,n] = d[i] - \mathbf{c}^{\mathsf H}[n] \mathbf{x}[i]$.

When $ \rho =1$, the squared errors for all sample times $ i$ up to the current time $ n$ contribute equally to the cost function $ J$. If $ 0<\rho <1$, the influence of past error values decays exponentially; this is the method of exponentially weighted least squares. $ \rho $ is called the forgetting factor.
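As a small numerical illustration (a sketch in Python with NumPy; the specific values of $ \rho $ and $ n$ are hypothetical), the weights $ \rho^{n-i}$ show how quickly old samples are forgotten:

```python
import numpy as np

# Exponential weighting rho^(n-i) in the cost function J[n].
# With rho = 0.95 and n = 100 (assumed toy values), a sample 50 steps
# in the past contributes only rho^50 (about 7.7%) of the weight of
# the current sample.
rho, n = 0.95, 100
weights = rho ** (n - np.arange(1, n + 1))  # rho^(n-i) for i = 1..n

print(weights[-1])   # current sample i = n: weight rho^0 = 1
print(weights[49])   # sample i = 50: weight rho^50
```

The sum of all weights approaches $ 1/(1-\rho )$ for large $ n$, so $ 1/(1-\rho )$ is often interpreted as the effective memory of the algorithm (here 20 samples).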

Analogous to the derivation of the LMS algorithm we find the gradient of the cost function with respect to the current weights

$\displaystyle \nabla_{\mathbf{c}} J[n] = \sum_{i=1}^{n} \rho ^{n-i} \left( - 2\, d^{\ast}[i]\,\mathbf{x}[i] + 2\, \mathbf{x}[i]\,\mathbf{x}^{\mathsf H}[i]\, \mathbf{c}[n]\right).$ (11)

In contrast to the LMS case, we now trust in our ability to estimate the expected values $ {\mathrm{E}}(d\,\mathbf{x})=\mathbf{p}$ and $ {\mathrm{E}}(\mathbf{x}\,\mathbf{x}^{\mathsf H})=\mathbf{R}$ with sufficient accuracy using all past samples. We therefore do not use a gradient descent method, but search for the minimum of the cost function directly by setting its gradient to zero, $ \nabla_{\mathbf{c}} J[n] = 0$. The resulting equation for the optimum filter coefficients at time $ n$ is

$\displaystyle {\boldsymbol \Phi}[n]\,\mathbf{c}[n] = \mathbf{z}[n], \qquad \mathbf{c}[n] = {\boldsymbol \Phi}^{-1}[n]\,\mathbf{z}[n],$ (12)

with $ {\boldsymbol \Phi}[n]= \sum_{i=1}^{n} \rho ^{n-i} \mathbf{x}[i]\,\mathbf{x}^{\mathsf H}[i]$, and $ \mathbf{z}[n] = \sum_{i=1}^{n} \rho ^{n-i} d^{\ast}[i]\,\mathbf{x}[i]$.
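As a sketch (Python with NumPy, on a hypothetical noise-free toy setup with assumed coefficients `c_true`), $ {\boldsymbol \Phi}[n]$ and $ \mathbf{z}[n]$ can be evaluated directly from their definitions and eq. 12 solved for the coefficients:

```python
import numpy as np

rng = np.random.default_rng(0)
M, n, rho = 4, 200, 0.99
c_true = rng.standard_normal(M)          # unknown system (assumed toy setup)

# Tap-input vectors x[i] (rows of X) and desired signal d[i] = c^H x[i];
# real-valued here, so Hermitian transpose reduces to transpose.
X = rng.standard_normal((n, M))
d = X @ c_true

# Phi[n] = sum_i rho^(n-i) x[i] x^H[i],  z[n] = sum_i rho^(n-i) d*[i] x[i]
w = rho ** (n - np.arange(1, n + 1))
Phi = (X * w[:, None]).T @ X
z = X.T @ (w * d)

c = np.linalg.solve(Phi, z)              # c[n] = Phi^{-1}[n] z[n]  (eq. 12)
print(np.allclose(c, c_true))            # noise-free data: exact recovery
```

Solving the linear system at every time step in this batch fashion would be expensive; the recursions below avoid it.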

Both $ {\boldsymbol \Phi}[n]$ and $ \mathbf{z}[n]$ can be computed recursively:

$\displaystyle {\boldsymbol \Phi}[n]= \rho \,{\boldsymbol \Phi}[n-1] + \mathbf{x}[n]\,\mathbf{x}^{\mathsf H}[n],$ (13)

$\displaystyle \mathbf{z}[n] = \rho \,\mathbf{z}[n-1] + d^{\ast}[n]\,\mathbf{x}[n].$ (14)
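A quick numerical check (on hypothetical random data) that the recursions of eqs. 13 and 14 reproduce the direct sums:

```python
import numpy as np

rng = np.random.default_rng(1)
M, n, rho = 3, 50, 0.9
X = rng.standard_normal((n, M))          # rows are tap-input vectors x[i]
d = rng.standard_normal(n)

# Recursive updates (eqs. 13, 14), starting from zero at n = 0
Phi = np.zeros((M, M))
z = np.zeros(M)
for i in range(n):
    Phi = rho * Phi + np.outer(X[i], X[i])  # Phi[n] = rho Phi[n-1] + x x^H
    z = rho * z + d[i] * X[i]               # z[n] = rho z[n-1] + d* x

# Direct evaluation of the definitions for comparison
w = rho ** (n - np.arange(1, n + 1))
Phi_direct = (X * w[:, None]).T @ X
z_direct = X.T @ (w * d)
print(np.allclose(Phi, Phi_direct), np.allclose(z, z_direct))
```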

To find the coefficient vector $ \mathbf{c}[n]$ we, however, need the inverse matrix $ {\boldsymbol \Phi}^{-1}[n]$. Using a matrix inversion lemma [1] a recursive update equation for $ \mathbf{P}[n] = {\boldsymbol\Phi}^{-1}[n]$ is found as:

$\displaystyle \mathbf{P}[n] = \rho ^{-1}\,\mathbf{P}[n-1] - \rho ^{-1}\,\mathbf{k}[n]\,\mathbf{x}^{\mathsf H}[n]\,\mathbf{P}[n-1],$      
with$\displaystyle \quad \mathbf{k}[n] = \frac{\rho ^{-1}\,\mathbf{P}[n-1]\,\mathbf{x}[n]} {1 + \rho ^{-1}\,\mathbf{x}^{\mathsf H}[n]\,\mathbf{P}[n-1]\,\mathbf{x}[n]}.$     (15)

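The rank-one inverse update can be verified numerically. In the sketch below (Python with NumPy, real-valued data), $ {\boldsymbol \Phi}$ and $ \mathbf{P}$ are initialized consistently as $ {\boldsymbol \Phi}[0]=\delta^{-1}\mathbf{I}$ and $ \mathbf{P}[0]=\delta\,\mathbf{I}$ (the regularizing constant $ \delta $ is an assumption, a common RLS initialization), so that $ \mathbf{P}[n]$ remains the exact inverse of $ {\boldsymbol \Phi}[n]$ at every step:

```python
import numpy as np

rng = np.random.default_rng(2)
M, n, rho, delta = 3, 40, 0.95, 1e2

# Consistent initialization: Phi[0] = (1/delta) I, P[0] = delta I
Phi = np.eye(M) / delta
P = delta * np.eye(M)
for _ in range(n):
    x = rng.standard_normal(M)
    Phi = rho * Phi + np.outer(x, x)            # eq. 13
    Px = P @ x
    k = (Px / rho) / (1.0 + x @ Px / rho)       # gain vector k[n]
    P = P / rho - np.outer(k, x @ P) / rho      # rank-one inverse update

print(np.allclose(P, np.linalg.inv(Phi)))       # P[n] tracks Phi^{-1}[n]
```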
Finally, the weights update equation is

$\displaystyle \mathbf{c}[n]= \mathbf{c}[n-1] + \mathbf{k}[n] \left(d^{\ast}[n] - \mathbf{x}^{\mathsf H}[n]\mathbf{c}[n-1]\right).$ (16)

The equations evaluated in the RLS algorithm at each time step $ n$ are eqs. 15 and 16. The RLS algorithm is computationally more complex than the LMS algorithm. Note, however, that due to the recursive updating the explicit inversion of the matrix $ {\boldsymbol \Phi}[n]$ is not necessary (which would impose a considerably higher computational load). The RLS algorithm typically shows a faster convergence than the LMS algorithm.
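Putting eqs. 15 and 16 together, a complete time step of the algorithm can be sketched as follows (Python with NumPy, real-valued signals; the function name, the initialization $ \mathbf{P}[0]=\delta\,\mathbf{I}$, and the system-identification toy setup are illustrative assumptions, not part of the notes):

```python
import numpy as np

def rls(x, d, M, rho=0.99, delta=1e2):
    """Sketch of the RLS filter: input signal x, desired signal d,
    M filter taps, forgetting factor rho, and assumed initialization
    P[0] = delta * I."""
    P = delta * np.eye(M)
    c = np.zeros(M)
    for i in range(M - 1, len(x)):
        xv = x[i - M + 1:i + 1][::-1]           # tap-input vector x[n]
        Px = P @ xv
        k = (Px / rho) / (1.0 + xv @ Px / rho)  # gain vector (eq. 15)
        P = P / rho - np.outer(k, xv @ P) / rho # inverse update (eq. 15)
        c = c + k * (d[i] - xv @ c)             # coefficient update (eq. 16)
    return c

# Toy system identification: the filter should converge to the
# coefficients of an unknown FIR system (hypothetical setup).
rng = np.random.default_rng(3)
c_true = np.array([0.5, -0.3, 0.1])
x = rng.standard_normal(500)
d = np.convolve(x, c_true)[:len(x)]
c_hat = rls(x, d, M=3)
print(np.round(c_hat, 3))
```

With noise-free data the estimated coefficients approach the true ones closely after a few hundred samples, illustrating the fast convergence mentioned above.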