## Dynamic Programming in Reinforcement Learning

Dynamic programming is used to solve many other problems, e.g.

- Scheduling algorithms
- String algorithms (e.g. sequence alignment)
- Graph algorithms (e.g. shortest path algorithms)
- Graphical models (e.g. Viterbi algorithm)
- Bioinformatics (e.g. lattice models)

Dynamic sequential or temporal component to the problem Programming optimising a â€śprogramâ€ť, i.e. a policy

- c.f. linear programming

A method for solving complex problems

- By breaking them down into subproblems
- Solve the subproblems
- Combine solutions to subproblems

Policy Iteration Value Iteration

### Extensions

- In-place dynamic programming
- Prioritized sweeping
- Use a priority queue

- Real-time dynamic programming

We will consider sample backups.