# Stochastic/Markov Game

Stochastic Games generalize MDPs to multiple interacting decision-makers.

Mentioned in F1TENTH Research Proposal.

Formalization

A **Markov Game** is a tuple $⟨S,A,P,R,γ⟩$, where:

- $S=S_{1}×⋯×S_{N}$ where $S_{i}$ is a finite set of states for player $i$
- $A=A_{1}×⋯×A_{N}$ where $A_{i}$ is a finite set of actions for player $i$
- $P:S×A×S→[0,1]$ is the state transition probability function, with $P(s'∣s,a)$ the probability of moving to state $s'$ from state $s$ under joint action $a$
- $R=R_{1}×⋯×R_{N}$ is a reward function, where $R_{i}:S×A×S→ℝ$
- $γ$ is a discount factor, $γ∈[0,1]$
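The tuple above can be made concrete with a toy instance. A minimal sketch (the states, actions, and payoff numbers below are assumed for illustration, not from the source): two players, two states, two actions each, with `P` holding next-state distributions and `R` holding per-player rewards.

```python
import random

# Toy two-player Markov game (numbers assumed for illustration).
# P[s][(a1, a2)] is a probability distribution over the next state.
P = {
    0: {(0, 0): [0.9, 0.1], (0, 1): [0.5, 0.5],
        (1, 0): [0.5, 0.5], (1, 1): [0.1, 0.9]},
    1: {(0, 0): [0.1, 0.9], (0, 1): [0.5, 0.5],
        (1, 0): [0.5, 0.5], (1, 1): [0.9, 0.1]},
}
# R[i][s][(a1, a2)] is player i's reward (here zero-sum and independent of s').
R = [
    {s: {(0, 0): 1, (0, 1): -1, (1, 0): -1, (1, 1): 1} for s in (0, 1)},
    {s: {(0, 0): -1, (0, 1): 1, (1, 0): 1, (1, 1): -1} for s in (0, 1)},
]

def step(s, joint_action):
    """Sample s' ~ P(.|s, a) and return (s', (r_1, r_2))."""
    s_next = random.choices([0, 1], weights=P[s][joint_action])[0]
    return s_next, tuple(R[i][s][joint_action] for i in range(2))
```

Each `step` call plays one joint action and draws the next state from $P$, mirroring how the tuple components interact.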

Value Function: $V_{π_{i},π_{−i}}(s):=E[∑_{t≥0}γ^{t}R_{i}(s_{t},a_{t},s_{t+1})∣a_{t}∼π(⋅∣s_{t}),s_{0}=s]$, where $π=(π_{i},π_{−i})$ is the joint policy.
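The discounted sum inside the expectation can be computed for a single sampled trajectory; the value function is then the average over trajectories. A minimal sketch (the reward sequence in the example is assumed):

```python
def discounted_return(rewards, gamma):
    """Compute sum_{t >= 0} gamma^t * r_t for one sampled reward sequence."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))

# Example: rewards [1, 1, 1] with gamma = 0.5 give 1 + 0.5 + 0.25 = 1.75
```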

**Nash Equilibrium**
A Nash Equilibrium of the Markov game $(N,S,{A_{i}}_{i∈N},P,{R_{i}}_{i∈N},γ)$ is a joint policy $π_{∗}=(π_{1,∗},…,π_{N,∗})$, such that for any $s∈S$ and $i∈N$
$V_{π_{i,∗},π_{−i,∗}}(s)≥V_{π_{i},π_{−i,∗}}(s)$ for any $π_{i}$

- where $−i$ represents the indices of all agents in N except agent $i$.
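In the degenerate one-state, single-stage case a Markov game reduces to a matrix game, where the Nash condition can be checked directly. A sketch using matching pennies (the game and candidate strategies are assumed for illustration, not from the source):

```python
# Matching pennies: player 1's payoff for joint action (a1, a2);
# the game is zero-sum, so player 2's payoff is the negative.
R1 = {(0, 0): 1, (0, 1): -1, (1, 0): -1, (1, 1): 1}

def expected_payoff(pi1, pi2):
    """Player 1's expected payoff under mixed strategies pi1, pi2."""
    return sum(pi1[a1] * pi2[a2] * R1[(a1, a2)]
               for a1 in (0, 1) for a2 in (0, 1))

# Candidate equilibrium: both players mix uniformly.
uniform = [0.5, 0.5]
v_star = expected_payoff(uniform, uniform)

# Nash condition for player 1: no pure deviation (and hence no mixed one)
# improves on v_star while player 2 sticks to the uniform strategy.
pure = {0: [1.0, 0.0], 1: [0.0, 1.0]}
assert all(expected_payoff(pure[a], uniform) <= v_star for a in (0, 1))
```

By the game's symmetry the same check holds for player 2, so uniform mixing is a Nash equilibrium of this toy game.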

Resources

- Turn-based Markov game formalization: https://arxiv.org/pdf/2002.10620.pdf