# Optimal Policy

Define a partial ordering over policies: $\pi \geq \pi'$ if $v_{\pi}(s) \geq v_{\pi'}(s),\ \forall s$

**Theorem**

For any MDP,

- There exists an optimal policy $\pi_*$ that is better than or equal to all other policies: $\pi_* \geq \pi,\ \forall \pi$
- All optimal policies achieve the optimal value function: $v_{\pi_*}(s) = v_*(s)$
- All optimal policies achieve the optimal action-value function: $q_{\pi_*}(s,a) = q_*(s,a)$
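A useful consequence of the theorem: once $q_*$ is known, an optimal deterministic policy can be read off greedily via $\pi_*(s) = \arg\max_a q_*(s,a)$. A minimal sketch, using a hypothetical $q_*$ table (the states, actions, and values below are illustrative, not from the text):

```python
# Hypothetical q* table: q_star[state][action] -> optimal action-value.
q_star = {
    "s0": {"left": 1.0, "right": 2.5},
    "s1": {"left": 0.0, "right": -1.0},
}

def greedy_policy(q):
    """Deterministic policy that acts greedily: pi(s) = argmax_a q(s, a)."""
    return {s: max(actions, key=actions.get) for s, actions in q.items()}

print(greedy_policy(q_star))  # → {'s0': 'right', 's1': 'left'}
```

Acting greedily with respect to $q_*$ is optimal precisely because $q_*$ already accounts for all future rewards under optimal behaviour.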

How do we arrive at the $q_*$ values? We use the Bellman optimality equation, which is central to solving MDPs:

$$q_*(s,a) = R(s,a) + \gamma \sum_{s'} P(s' \mid s, a) \max_{a'} q_*(s', a')$$
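As a concrete illustration, the Bellman optimality equation can be turned into an iterative update (q-value iteration) that converges to $q_*$. A minimal sketch on a tiny hypothetical deterministic MDP (the two states, transitions, and rewards below are made up for illustration):

```python
# Hypothetical deterministic MDP: P[state][action] -> (next_state, reward).
gamma = 0.9
P = {
    "A": {"stay": ("A", 0.0), "go": ("B", 1.0)},
    "B": {"stay": ("B", 2.0), "go": ("A", 0.0)},
}

# Initialise q(s, a) = 0 and repeatedly apply the Bellman optimality backup:
# q(s, a) <- r + gamma * max_a' q(s', a')
q = {s: {a: 0.0 for a in acts} for s, acts in P.items()}
for _ in range(500):  # enough sweeps for convergence on this tiny MDP
    q = {
        s: {a: r + gamma * max(q[s2].values()) for a, (s2, r) in acts.items()}
        for s, acts in P.items()
    }

# Greedy policy with respect to the (approximate) q*:
print({s: max(acts, key=acts.get) for s, acts in q.items()})  # → {'A': 'go', 'B': 'stay'}
```

Here the fixed point is $v_*(\mathrm{B}) = 2/(1-\gamma) = 20$, so $q(\mathrm{B},\mathrm{stay})$ converges to 20 and the greedy policy moves to B and stays there.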

See Dynamic Programming.