🛠️ Steven Gong

Feb 11, 2026, 1 min read

Double Q-Learning

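Double Q-Learning was invented to avoid Maximization Bias in Q-Learning. It keeps two independent action-value estimates and uses one to select the greedy action and the other to evaluate it, so the value assigned to the selected action is an unbiased estimate.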
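To make the bias concrete, here is a small simulation sketch. It is an illustration rather than part of the original note: it assumes every action has a true value of 0 and that each estimate is corrupted by unit Gaussian noise. The function name `estimate_max` and its parameters are hypothetical.

```python
import random


def estimate_max(num_actions=10, trials=2000, seed=0):
    """Compare single and double estimators of max_a q(a) when every q(a) = 0."""
    rng = random.Random(seed)
    single_total = 0.0
    double_total = 0.0
    for _ in range(trials):
        # Two independent noisy estimates of each action's value (assumed setup).
        est_a = [rng.gauss(0.0, 1.0) for _ in range(num_actions)]
        est_b = [rng.gauss(0.0, 1.0) for _ in range(num_actions)]
        # Single estimator: select and evaluate with the same noisy estimates.
        single_total += max(est_a)
        # Double estimator: select with est_a, evaluate with est_b.
        best = max(range(num_actions), key=lambda i: est_a[i])
        double_total += est_b[best]
    return single_total / trials, double_total / trials


single, double = estimate_max()
print(f"single estimator: {single:.3f}, double estimator: {double:.3f}")
```

The true maximum is 0 here, so the single estimator's average comes out clearly positive while the double estimator's average stays close to 0.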

Pseudocode
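The original pseudocode figure did not survive on this page, so what follows is a minimal sketch of the standard tabular update, not a reproduction of it. The function name `double_q_update`, the `(state, action)` dictionary keys, and the default `alpha` and `gamma` values are assumptions made for illustration.

```python
import random
from collections import defaultdict


def double_q_update(q1, q2, state, action, reward, next_state, actions,
                    alpha=0.1, gamma=0.99, done=False, rng=random):
    """Apply one tabular Double Q-Learning update in place."""
    # Flip a fair coin to decide which table gets updated on this step.
    if rng.random() < 0.5:
        q_sel, q_eval = q1, q2
    else:
        q_sel, q_eval = q2, q1
    if done:
        target = reward
    else:
        # Select the greedy next action with the table being updated...
        best = max(actions, key=lambda a: q_sel[(next_state, a)])
        # ...but evaluate that action with the other table.
        target = reward + gamma * q_eval[(next_state, best)]
    q_sel[(state, action)] += alpha * (target - q_sel[(state, action)])


# Usage: both tables default to 0 for unseen (state, action) pairs.
q1 = defaultdict(float)
q2 = defaultdict(float)
double_q_update(q1, q2, "s0", "left", 1.0, "s1", ["left", "right"])
```

A common choice for the behavior policy is epsilon-greedy with respect to the sum `q1 + q2`, which is left out here to keep the sketch short.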
