🛠️ Steven Gong

Jul 21, 2025, 1 min read

Conservative Q-Learning (CQL)

Introduced to me by Jason Ma.

https://arxiv.org/pdf/2006.04779

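This is how you do offline RL without so much overestimation bias: CQL adds a regularizer to the usual Bellman error that pushes Q-values down on actions outside the dataset and up on actions in it, so the resulting value estimate is a lower bound on the policy's true value. Honestly, I’m still quite confused by it.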
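
To make the idea concrete for myself, here is a minimal sketch of the discrete-action CQL(H) objective as I read it from the paper linked above: the standard squared TD error plus `alpha` times a penalty, `logsumexp_a Q(s, a) - Q(s, a_data)`. The function name `cql_loss`, the array shapes, and the default `alpha` are my own assumptions for illustration, not the authors' code, and the TD targets are taken as given rather than computed from a target network.

```python
import numpy as np

def logsumexp(x, axis=-1):
    # Numerically stable log(sum(exp(x))) along `axis`.
    m = np.max(x, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(x - m), axis=axis))

def cql_loss(q_values, actions, td_targets, alpha=1.0):
    """Sketch of a CQL(H)-style loss for discrete actions.

    q_values:   (batch, num_actions) Q(s, a) for every action
    actions:    (batch,) integer actions actually taken in the dataset
    td_targets: (batch,) precomputed Bellman targets (assumed given)
    alpha:      weight on the conservative penalty (hypothetical default)
    """
    rows = np.arange(q_values.shape[0])
    q_data = q_values[rows, actions]  # Q(s, a) on dataset actions

    # Standard Bellman error term.
    td_loss = 0.5 * np.mean((q_data - td_targets) ** 2)

    # Conservative term: push down a soft maximum over all actions,
    # push up the Q-value of the action seen in the data.
    penalty = np.mean(logsumexp(q_values, axis=1) - q_data)

    return td_loss + alpha * penalty, td_loss, penalty

# Tiny usage example with made-up numbers.
q = np.array([[1.0, 5.0, 0.0],
              [2.0, 2.0, 2.0]])
a = np.array([0, 1])
y = np.array([1.0, 2.0])
total, td, pen = cql_loss(q, a, y, alpha=0.5)
print(total, td, pen)
```

The penalty is never negative, since the log-sum-exp over all actions is at least as large as any single Q-value, and it grows when some action not taken in the data has a much higher Q-value than the one that was.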
