RL theory I [3 P]

Prove Corollary 1.1 (p. 7) from the lecture notes Theory of Reinforcement Learning ³:

For every policy $\pi$ there exists a deterministic policy $\pi'$ such that $\pi' \geq \pi$. As a special case: if there exists a stochastic optimal policy $\pi$, then there also exists a deterministic optimal policy $\pi'$ such that $\pi' \geq \pi$.
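
Before attempting the proof, it can help to check the statement numerically on a toy problem. The sketch below is only an illustration, not a proof, and it rests on several assumptions that are not taken from the lecture notes: a discounted infinite-horizon setting with discount factor `GAMMA`, the reading of $\pi' \geq \pi$ as $V^{\pi'}(s) \geq V^{\pi}(s)$ for every state $s$, and a made-up two-state, two-action MDP (`P` and `R` are hypothetical). The deterministic policy is built by acting greedily with respect to $Q^{\pi}$, which is one common construction and may differ from the argument intended in the notes.

```python
# Hypothetical 2-state, 2-action discounted MDP; all numbers are invented.
GAMMA = 0.9
N_STATES, N_ACTIONS = 2, 2
# P[s][a][s2] = probability of moving from s to s2 under action a
P = [[[0.8, 0.2], [0.1, 0.9]],
     [[0.5, 0.5], [0.0, 1.0]]]
# R[s][a] = expected immediate reward for taking action a in state s
R = [[1.0, 0.0],
     [0.0, 2.0]]


def q_values(v):
    """Q(s, a) = R(s, a) + GAMMA * sum over s2 of P(s2 | s, a) * v(s2)."""
    return [[R[s][a] + GAMMA * sum(p * x for p, x in zip(P[s][a], v))
             for a in range(N_ACTIONS)]
            for s in range(N_STATES)]


def evaluate(policy, sweeps=2000):
    """Iterative policy evaluation; policy[s][a] is the probability of a in s."""
    v = [0.0] * N_STATES
    for _ in range(sweeps):
        q = q_values(v)
        v = [sum(policy[s][a] * q[s][a] for a in range(N_ACTIONS))
             for s in range(N_STATES)]
    return v


def greedy(policy):
    """Deterministic policy that picks an action maximising Q^pi in each state."""
    q = q_values(evaluate(policy))
    result = []
    for s in range(N_STATES):
        best = max(range(N_ACTIONS), key=lambda a: q[s][a])
        result.append([1.0 if a == best else 0.0 for a in range(N_ACTIONS)])
    return result


stochastic = [[0.5, 0.5], [0.5, 0.5]]   # uniformly random policy pi
deterministic = greedy(stochastic)      # candidate deterministic policy pi'
v_pi = evaluate(stochastic)
v_det = evaluate(deterministic)
print("V^pi  =", v_pi)
print("V^pi' =", v_det)
```

Under these assumptions, every entry of `v_det` should be at least as large as the corresponding entry of `v_pi`. The proof then has to show that this holds for an arbitrary policy and MDP, not just for this example.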

Haeusler Stefan 2011-01-25