You will regret no matter what you do. The reason is that you are human.
Hang yourself or do not hang yourself, you will regret both. That is the essence of human philosophy.
I am afraid of regretting not living up to my full potential.
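The regret is the opportunity loss for one step:
$$l_t = \mathbb{E}[V^* - Q(a_t)]$$
Better notation:
$$l_t = \mathbb{E}[v_* - q(A_t)]$$
where $q(a)$ is the true expected reward of action $a$ and $v_* = \max_a q(a)$ is the value of the best action.
The total regret is the total opportunity loss:
$$L_t = \mathbb{E}\left[\sum_{\tau=1}^{t} \left(v_* - q(A_\tau)\right)\right] = t\left(v_* - \mathbb{E}[\bar{R}_t]\right)$$
where $\bar{R}_t = \frac{1}{t}\sum_{\tau=1}^{t} R_\tau$ is the average reward up to timestep $t$.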
In Game Theory, the regret of not having chosen an action is the difference between the utility/reward of that action and that of the action we actually chose, with the other players' choices held fixed.
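Regret is useful because it tells us how well an algorithm does compared with always playing the best action. With Epsilon-Greedy and a constant $\epsilon$, we have linear regret: the algorithm keeps exploring at random forever, so every step adds a constant expected regret. This is why Epsilon-Greedy is a naive algorithm, and we prefer other algorithms.
Optimistic Greedy has optimistic initialization (we initialize every action-value estimate with the highest possible reward).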
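To make the linear-regret claim concrete, here is a minimal sketch that runs Epsilon-Greedy on a small Bernoulli bandit and records the total regret after each step, i.e. the running sum of the gap between the best arm's mean reward and the chosen arm's mean reward. The arm means, `epsilon`, the seed, and the function name `epsilon_greedy_regret` are all illustrative assumptions, not part of the original note.

```python
import random

def epsilon_greedy_regret(q, epsilon, steps, seed=0):
    """Run epsilon-greedy on a Bernoulli bandit with true arm means q.

    Returns a list whose entry t-1 is the total regret after t steps.
    (Hypothetical helper for illustration only.)
    """
    rng = random.Random(seed)
    v_star = max(q)              # mean reward of the best arm
    counts = [0] * len(q)        # number of pulls per arm
    est = [0.0] * len(q)         # sample-mean reward estimate per arm
    total, curve = 0.0, []
    for _ in range(steps):
        if rng.random() < epsilon:
            a = rng.randrange(len(q))                      # explore
        else:
            a = max(range(len(q)), key=lambda i: est[i])   # exploit
        r = 1.0 if rng.random() < q[a] else 0.0            # Bernoulli reward
        counts[a] += 1
        est[a] += (r - est[a]) / counts[a]                 # incremental mean
        total += v_star - q[a]                             # one-step regret
        curve.append(total)
    return curve

curve = epsilon_greedy_regret(q=[0.2, 0.5, 0.8], epsilon=0.1, steps=20000)
print("regret after 10k steps:", curve[9999])
print("regret after 20k steps:", curve[19999])
```

With a constant `epsilon`, the second half of the run should add roughly as much regret as the first half, which is what linear growth looks like. Decaying `epsilon` over time is one common way to do better.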
From CFR page: Regret is a measure of how much one regrets not having chosen an action. It is the difference between the utility/reward of that action and that of the action we actually chose, with respect to the fixed choices of other players.
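The overall average regret of player $i$ at time $T$ is
$$R_i^T = \frac{1}{T} \max_{\sigma_i^* \in \Sigma_i} \sum_{t=1}^{T} \left( u_i(\sigma_i^*, \sigma_{-i}^t) - u_i(\sigma^t) \right)$$
- where $\sigma_i^t$ is the strategy used by player $i$ on round $t$
- $u_i(\sigma_i^*, \sigma_{-i}^t)$ is the utility of a strategy profile with $\sigma_i^*$ and $\sigma_{-i}^t$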
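As a rough illustration of the average regret described above, the sketch below computes it for one player in a two-player normal-form game. It compares the utility the player actually got each round against the best fixed strategy in hindsight, with the opponent's strategies held fixed. It assumes a matrix game, where expected utility is linear in the player's own strategy, so the best fixed strategy can be found among the pure actions. The rock-paper-scissors payoff matrix and the helper names `expected_utility` and `average_overall_regret` are assumptions made for this example.

```python
def expected_utility(payoff, sigma_i, sigma_opp):
    """Expected utility for player i when both players use mixed strategies."""
    return sum(
        p * q * payoff[a][b]
        for a, p in enumerate(sigma_i)
        for b, q in enumerate(sigma_opp)
    )

def average_overall_regret(payoff, history):
    """history is a list of (player i's strategy, opponent's strategy) per round.

    Returns (1/T) * max over fixed pure actions of the summed utility gain
    from deviating to that action on every round.
    """
    T = len(history)
    n_actions = len(payoff)
    best_gain = float("-inf")
    for a in range(n_actions):
        pure = [1.0 if k == a else 0.0 for k in range(n_actions)]
        gain = sum(
            expected_utility(payoff, pure, opp) - expected_utility(payoff, mine, opp)
            for mine, opp in history
        )
        best_gain = max(best_gain, gain)
    return best_gain / T

# Rock-paper-scissors payoffs for player i (rows: own R, P, S; columns: opponent R, P, S).
RPS = [[0, -1, 1],
       [1, 0, -1],
       [-1, 1, 0]]

rock, paper = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
always_rock_vs_paper = [(rock, paper)] * 4   # player i loses every round
print(average_overall_regret(RPS, always_rock_vs_paper))
```

In this history the player always plays rock against paper, so the best fixed deviation is scissors, and the average regret is the per-round gap between winning and losing.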