Discount Factor

The discount factor $γ \in [0, 1]$ is used in Markov decision processes (MDPs) to weight future rewards against immediate ones. Why does it exist?

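It is mathematically convenient to discount rewards: it avoids infinite returns in processes that never terminate.
The future is uncertain, and the model may not fully capture that uncertainty.
Animal and human behavior shows a preference for immediate reward.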
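
The first reason can be made concrete with a short worked bound. This is a sketch under assumptions not stated in the note: the return is written $G_t$, the reward at step $t+k+1$ is $R_{t+k+1}$, and every reward is bounded in magnitude by some constant $R_{\max}$.

$$
G_t = \sum_{k=0}^{\infty} γ^k R_{t+k+1},
\qquad
|G_t| \le \sum_{k=0}^{\infty} γ^k R_{\max} = \frac{R_{\max}}{1 - γ}
\quad \text{for } 0 \le γ < 1.
$$

With $γ = 1$ the geometric series no longer converges, so the return of a non-terminating process can be infinite.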

Undiscounted Markov reward processes ($γ = 1$) also exist, for example when every sequence is guaranteed to terminate.

The discount factor punishes you for being slow: a reward received $k$ steps in the future is worth only $γ^k$ times the same reward received now. The lower the discount factor, the less future rewards matter compared with present ones.

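Compare $γ = 0.99$ (far-sighted: rewards many steps ahead still count) with $γ = 0.1$ (myopic: rewards more than a few steps ahead are almost worthless).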
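
To get a feel for the two values above, here is a minimal Python sketch. The function name `discounted_return` and the example episode (a single reward of 1.0 after ten zero-reward steps) are hypothetical, chosen only for illustration.

```python
def discounted_return(rewards, gamma):
    """Return sum(gamma**k * rewards[k]) for a finite reward sequence."""
    g = 0.0
    # Accumulate from the last reward backwards: G_k = r_k + gamma * G_{k+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Hypothetical episode: ten steps with zero reward, then a reward of 1.0.
rewards = [0.0] * 10 + [1.0]

for gamma in (0.99, 0.1):
    print(f"gamma={gamma}: return={discounted_return(rewards, gamma):.3g}")
```

The delayed reward is scaled by $γ^{10}$, which is roughly 0.90 for $γ = 0.99$ and $10^{-10}$ for $γ = 0.1$: with a low discount factor, a slow agent gets almost no credit for the reward it eventually reaches.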

🛠️ Steven Gong
