Markov Decision Process

Partially Observable Markov Decision Process

A Partially Observable Markov Decision Process is an MDP with hidden states. It is a hidden Markov model with actions.

  • i.e. when the environment is only partially observable by the agent, e.g. a game of poker, where the opponents' cards are hidden.

The additions to the original MDP are the set of observations and the observation function.

Definition

A POMDP is a tuple ⟨S, A, O, P, R, Z, γ⟩, where

  • S is a finite set of states
  • A is a finite set of actions
  • O is a finite set of observations
  • P is a state transition probability matrix, P(s' | s, a) = P[S_{t+1} = s' | S_t = s, A_t = a]
  • R is a reward function, R(s, a) = E[R_{t+1} | S_t = s, A_t = a]
  • Z is an observation function, Z(o | s', a) = P[O_{t+1} = o | S_{t+1} = s', A_t = a]
  • γ is a discount factor, γ ∈ [0, 1]
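The tuple above can be sketched in code. Below is a minimal illustration of a hypothetical two-state, two-action, two-observation POMDP (all numbers are made up for the example, not from the notes), together with the standard Bayes-filter belief update that an agent uses because it cannot observe the state directly:

```python
import numpy as np

# A hypothetical POMDP as the tuple (S, A, O, P, R, Z, gamma).
# All probabilities below are illustrative values, not from any real problem.
n_states, n_actions, n_obs = 2, 2, 2

# P[a, s, s'] = P(s' | s, a): state transition probabilities
P = np.array([
    [[0.9, 0.1], [0.2, 0.8]],   # action 0
    [[0.5, 0.5], [0.5, 0.5]],   # action 1
])

# R[a, s] = E[reward | state s, action a]
R = np.array([
    [1.0, -1.0],                # action 0
    [0.0,  0.0],                # action 1
])

# Z[a, s', o] = P(o | s', a): observation probabilities
Z = np.array([
    [[0.85, 0.15], [0.3, 0.7]],  # action 0
    [[0.5,  0.5],  [0.5, 0.5]],  # action 1
])

gamma = 0.95

def belief_update(b, a, o):
    """Bayes filter: b'(s') is proportional to Z(o | s', a) * sum_s P(s' | s, a) b(s)."""
    predicted = b @ P[a]                 # predict: sum_s b(s) P(s' | s, a)
    unnormalised = Z[a, :, o] * predicted  # correct: weight by observation likelihood
    return unnormalised / unnormalised.sum()

# Start with a uniform belief over the hidden state, then take action 0
# and receive observation 0; the belief shifts toward state 0.
b = np.array([0.5, 0.5])
b = belief_update(b, a=0, o=0)
```

Because the state is hidden, the belief vector b is a probability distribution over S, and it plays the role that the known state plays in a fully observable MDP.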