Greedy in the Limit with Infinite Exploration
A sequence of policies is Greedy in the Limit with Infinite Exploration (GLIE) if:
All state-action pairs are explored infinitely many times, and
The policy converges to a greedy policy (greedy with respect to the current action-value estimates).
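One common way to meet both conditions is epsilon-greedy exploration with a decaying rate such as epsilon = 1/k, where k is the episode number. The notes do not name a schedule, so treat the 1/k choice and the hypothetical helper names `glie_epsilon` and `epsilon_greedy_probs` below as assumptions in a minimal sketch:

```python
def glie_epsilon(k):
    """Exploration rate for episode k (k >= 1); assumed 1/k schedule, decays to 0."""
    return 1.0 / k


def epsilon_greedy_probs(q_values, epsilon):
    """Action probabilities for a single state under epsilon-greedy.

    q_values: list of action-value estimates, one per action.
    """
    n = len(q_values)
    greedy = max(range(n), key=lambda a: q_values[a])  # ties go to the lowest index
    probs = [epsilon / n] * n        # every action keeps some exploration mass
    probs[greedy] += 1.0 - epsilon   # the remaining mass goes to the greedy action
    return probs
```

For any finite k every action still has nonzero probability (exploration continues), while the probability of the greedy action tends to 1 as k grows (greedy in the limit).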
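I initially undersold how important this is, but it is EXTREMELY important to understand: the first condition keeps us from missing good actions, and the second makes sure we eventually act on what we have learned.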
We use this GLIE idea for Monte-Carlo Control.
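To make the connection concrete, here is a rough sketch of what GLIE Monte-Carlo control could look like. It assumes an every-visit update with a 1/N step size, an epsilon = 1/k schedule, and a hypothetical environment interface (`reset()` returns a start state, `step(state, action)` returns `(next_state, reward, done)`); none of these details come from the notes themselves.

```python
import random
from collections import defaultdict


def glie_mc_control(reset, step, n_actions, episodes, gamma=1.0, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)  # action-value estimates, keyed by (state, action)
    N = defaultdict(int)    # visit counts, keyed by (state, action)
    for k in range(1, episodes + 1):
        epsilon = 1.0 / k   # assumed GLIE schedule: explore less as k grows
        # Generate one episode with the current epsilon-greedy policy.
        state, done, trajectory = reset(), False, []
        while not done:
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                action = max(range(n_actions), key=lambda a: Q[(state, a)])
            next_state, reward, done = step(state, action)
            trajectory.append((state, action, reward))
            state = next_state
        # Every-visit update: move Q toward the observed return G.
        G = 0.0
        for s, a, r in reversed(trajectory):
            G = r + gamma * G
            N[(s, a)] += 1
            Q[(s, a)] += (G - Q[(s, a)]) / N[(s, a)]
    return Q, N


# Hypothetical one-step task: action 1 pays 1, action 0 pays 0.
def toy_reset():
    return 0


def toy_step(state, action):
    return None, float(action), True


Q, N = glie_mc_control(toy_reset, toy_step, n_actions=2, episodes=50)
```

The policy is never stored explicitly here: it is recomputed from Q and the current epsilon at every step, which is one simple way to keep policy improvement in sync with the estimates.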