My Master's thesis, "The Reinforcement Learning Toolbox, Reinforcement
Learning for optimal control tasks", was finished in June 2005. It
contains a comprehensive description of the class system of the RL
Toolbox. It explains the use of RL for optimal control tasks and
introduces many different algorithms, including continuous-time RL,
continuous actor-critic learning, Residual and Residual Gradient
algorithms, and policy search algorithms such as CONJMDP and PEGASUS.
These algorithms are tested and compared on three benchmark tasks:
the Pendulum, Cart-Pole and Acrobot swing-up tasks. You can download
the thesis here.