Prove Corollary 1.1 (p. 7) from the script Theory of Reinforcement Learning 3:
For every policy $\pi$ there exists a deterministic policy $\pi'$ such that $V^{\pi'} \geq V^{\pi}$. As a special case: if there exists a stochastic optimal policy $\pi^*$, then there also exists a deterministic optimal policy $\pi'$ such that $V^{\pi'} = V^{\pi^*}$.
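Before attempting the proof, it can help to check the claim numerically on a toy problem. The sketch below is not the proof and not necessarily the construction used in the script. It assumes a finite, discounted MDP and uses one standard construction: act greedily with respect to the action values of the given policy. The transition table `P`, the rewards `R`, the discount `GAMMA`, and all function names are made up for illustration.

```python
# Toy 2-state, 2-action discounted MDP (all numbers are hypothetical).
GAMMA = 0.9
N_STATES, N_ACTIONS = 2, 2
# P[s][a][s2]: probability of moving from s to s2 under action a.
P = [
    [[0.8, 0.2], [0.1, 0.9]],
    [[0.5, 0.5], [0.0, 1.0]],
]
# R[s][a]: expected immediate reward for taking action a in state s.
R = [
    [1.0, 0.0],
    [0.0, 2.0],
]


def q_values(v):
    """One-step lookahead: Q(s, a) = R(s, a) + GAMMA * sum_s2 P(s2 | s, a) * v(s2)."""
    return [
        [
            R[s][a] + GAMMA * sum(P[s][a][s2] * v[s2] for s2 in range(N_STATES))
            for a in range(N_ACTIONS)
        ]
        for s in range(N_STATES)
    ]


def evaluate(policy, sweeps=2000):
    """Approximate V^pi by repeatedly applying the Bellman expectation operator."""
    v = [0.0] * N_STATES
    for _ in range(sweeps):
        q = q_values(v)
        v = [
            sum(policy[s][a] * q[s][a] for a in range(N_ACTIONS))
            for s in range(N_STATES)
        ]
    return v


def greedy(policy):
    """Deterministic policy that picks, in each state, an action maximizing Q^pi."""
    q = q_values(evaluate(policy))
    result = []
    for s in range(N_STATES):
        best = max(range(N_ACTIONS), key=lambda a: q[s][a])
        result.append([1.0 if a == best else 0.0 for a in range(N_ACTIONS)])
    return result


# policy[s][a] is the probability of choosing action a in state s.
stochastic = [[0.5, 0.5], [0.5, 0.5]]
deterministic = greedy(stochastic)
v_old = evaluate(stochastic)
v_new = evaluate(deterministic)
print("V of stochastic policy:   ", v_old)
print("V of deterministic policy:", v_new)
```

In this sketch the deterministic policy should do at least as well as the stochastic one in every state, which is the inequality the corollary asserts. A proof has to establish this for arbitrary policies and MDPs within the script's setting, not only for one example.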