Partially Observable Markov Decision Process
A Partially Observable Markov Decision Process (POMDP) is an MDP with hidden states; equivalently, it is a hidden Markov model augmented with actions.
- i.e. the agent can only partially observe the state of the environment, e.g. in a game of poker, where the opponents' cards are hidden.
Compared to the original MDP, the new components are the set of observations and the observation function.
Definition
A POMDP is a tuple ⟨S, A, O, P, R, Z, γ⟩, where:
- S is a finite set of states
- A is a finite set of actions
- O is a finite set of observations
- P is a state transition probability matrix, P(s' | s, a) = Pr[S_{t+1} = s' | S_t = s, A_t = a]
- R is a reward function, R(s, a) = E[R_{t+1} | S_t = s, A_t = a]
- Z is an observation function, Z(o | s', a) = Pr[O_{t+1} = o | S_{t+1} = s', A_t = a]
- γ is a discount factor, γ ∈ [0, 1]
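Because the agent never sees the state directly, it typically maintains a belief — a probability distribution over states — updated from the transition and observation functions by Bayes' rule. The sketch below illustrates this with the classic two-state "tiger" problem; the specific probabilities (an 85%-accurate listening action) are illustrative assumptions, not part of the definition above.

```python
# A minimal POMDP as plain data, using the hypothetical two-state "tiger" problem.
states = ["tiger-left", "tiger-right"]
actions = ["listen"]
observations = ["hear-left", "hear-right"]

# P[a][s][s']: transition probabilities (listening does not move the tiger)
P = {"listen": {"tiger-left":  {"tiger-left": 1.0, "tiger-right": 0.0},
                "tiger-right": {"tiger-left": 0.0, "tiger-right": 1.0}}}

# Z[a][s'][o]: observation probabilities (the growl is heard correctly 85% of the time)
Z = {"listen": {"tiger-left":  {"hear-left": 0.85, "hear-right": 0.15},
                "tiger-right": {"hear-left": 0.15, "hear-right": 0.85}}}

def belief_update(b, a, o):
    """Bayes-filter update: b'(s') ∝ Z(o | s', a) * sum_s P(s' | s, a) * b(s)."""
    unnormalized = {s2: Z[a][s2][o] * sum(P[a][s][s2] * b[s] for s in states)
                    for s2 in states}
    norm = sum(unnormalized.values())
    return {s: p / norm for s, p in unnormalized.items()}

# Starting from a uniform belief, hearing a growl on the left
# shifts the belief toward the tiger being on the left.
b = {"tiger-left": 0.5, "tiger-right": 0.5}
b = belief_update(b, "listen", "hear-left")
print(b)  # belief concentrates on "tiger-left" (≈ 0.85)
```

Repeated applications of `belief_update` turn the POMDP into a fully observable MDP over beliefs, which is the basis of most exact POMDP solution methods.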