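Double Q-Learning

Double Q-learning was introduced to avoid maximization bias. Standard Q-learning uses the same value estimates both to select the maximizing action and to evaluate it, so noise in the estimates pushes the target upward. Double Q-learning keeps two independent estimates, selects the action with one and evaluates it with the other, which makes the estimate of the selected action's value unbiased.

Pseudocode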
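
As a minimal sketch of the tabular update, assuming Q-tables stored as lists of lists indexed by integer state and action: the function names `double_q_update` and `epsilon_greedy`, the parameter names, and the default step size and discount are illustrative assumptions, not taken from the original pseudocode. A coin flip decides which table is updated; that table picks the greedy next action and the other table supplies its value.

```python
import random


def epsilon_greedy(q_a, q_b, state, epsilon, rng=random):
    """Pick an action epsilon-greedily with respect to the sum of both tables."""
    n_actions = len(q_a[state])
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    return max(range(n_actions), key=lambda a: q_a[state][a] + q_b[state][a])


def double_q_update(q_a, q_b, state, action, reward, next_state, done,
                    alpha=0.1, gamma=0.99, rng=random):
    """Apply one tabular Double Q-learning update in place (illustrative sketch)."""
    # Flip a fair coin to choose which table is updated on this step.
    if rng.random() < 0.5:
        q_sel, q_eval = q_a, q_b
    else:
        q_sel, q_eval = q_b, q_a

    if done:
        # No bootstrapping from a terminal state.
        target = reward
    else:
        # Select the greedy next action with the table being updated...
        n_actions = len(q_sel[next_state])
        best = max(range(n_actions), key=lambda a: q_sel[next_state][a])
        # ...but evaluate that action with the other table.
        target = reward + gamma * q_eval[next_state][best]

    q_sel[state][action] += alpha * (target - q_sel[state][action])
```

Separating selection from evaluation is the whole point of the design: an action that looks best only because of noise in one table is unlikely to be overestimated by the same amount in the other.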