Increasing the step size
generally results in faster
convergence of the LMS algorithm,
provided it stays below the stability
bound; too large a step size makes
the filter diverge.
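A minimal sketch of this trade-off, using a hypothetical unknown FIR system (the coefficients `h_true`, the step sizes, and the signal lengths are all assumptions for illustration, not from the text). The same setup also serves as a small system-identification example: the filter weights should converge to the unknown system's impulse response.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical unknown FIR system to be identified (assumed for this sketch).
h_true = np.array([0.5, -0.3, 0.2])
M = len(h_true)

n_samples = 2000
x = rng.standard_normal(n_samples)          # white input signal
d = np.convolve(x, h_true)[:n_samples]      # desired signal: output of the unknown system

def lms(x, d, M, mu):
    """Plain LMS: w <- w + mu * e[n] * x_vec."""
    w = np.zeros(M)
    e = np.zeros(len(x))
    for n in range(M, len(x)):
        x_vec = x[n - M + 1:n + 1][::-1]    # most recent sample first
        y = w @ x_vec                       # filter output
        e[n] = d[n] - y                     # a priori error
        w = w + mu * e[n] * x_vec           # LMS weight update
    return w, e

# A larger (but still stable) step size converges in fewer samples;
# an even larger one would make the weights diverge.
w_small, e_small = lms(x, d, M, mu=0.005)
w_large, e_large = lms(x, d, M, mu=0.05)
```

With white input of unit power, the convergence time constant scales roughly as 1/mu, so the `mu=0.05` run reaches the true coefficients about ten times faster than the `mu=0.005` run.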

The goal of system identification is
to build a model of an unknown system.

The RLS algorithm usually converges
faster than the LMS algorithm.
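A sketch of the RLS recursion on the same kind of system-identification task (the system coefficients, forgetting factor `lam`, and initialisation constant `delta` are assumptions for illustration). RLS typically converges in on the order of the filter length in samples, at the cost of O(M²) work per update versus O(M) for LMS.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical unknown FIR system (assumed for this sketch).
h_true = np.array([0.5, -0.3, 0.2])
M = len(h_true)

n_samples = 500
x = rng.standard_normal(n_samples)
d = np.convolve(x, h_true)[:n_samples]

def rls(x, d, M, lam=0.99, delta=100.0):
    """RLS with forgetting factor lam; inverse-correlation matrix P
    initialised to delta * I."""
    w = np.zeros(M)
    P = delta * np.eye(M)
    for n in range(M, len(x)):
        u = x[n - M + 1:n + 1][::-1]   # most recent sample first
        Pu = P @ u
        k = Pu / (lam + u @ Pu)        # gain vector
        e = d[n] - w @ u               # a priori error
        w = w + k * e                  # weight update
        P = (P - np.outer(k, Pu)) / lam  # update inverse correlation matrix
    return w

w_rls = rls(x, d, M)
```

A forgetting factor lam < 1 exponentially down-weights old data, which lets the filter track a time-varying system instead of fitting the entire past equally.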

Why is it (usually) not desirable to
achieve the global minimum of the mean squared error (over the whole
time series) for an adaptive filter?

An adaptive filter trained using the
RLS algorithm with a forgetting factor
