Conservative Q-Learning for Offline Reinforcement Learning (CQL)
Introduced to me by Jason Ma.
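This is how you do offline RL without so much bias: CQL learns a deliberately conservative Q-function that pushes down the values of actions not seen in the dataset, which counters the overestimation that ordinary Q-learning suffers from when it cannot collect new data. Honestly, I’m still quite confused by it.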
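To make the "less bias" idea concrete for myself, here is a minimal sketch of the conservative penalty as I understand it for discrete actions: the log-sum-exp of the Q-values over all actions minus the Q-value of the action that actually appears in the dataset. The function name `cql_penalty`, the array shapes, and the use of NumPy are my own assumptions for illustration, not code from the paper; in a full method this term would be scaled by a weight and added to the usual TD loss.

```python
import numpy as np

def cql_penalty(q_values, dataset_actions):
    """Sketch of a CQL-style regularizer for discrete actions (hypothetical helper).

    q_values: float array of shape (batch, num_actions), Q(s, a) for every action.
    dataset_actions: int array of shape (batch,), the actions logged in the dataset.
    Returns the batch mean of logsumexp_a Q(s, a) - Q(s, a_data).
    """
    # Numerically stable log-sum-exp over the action dimension.
    row_max = q_values.max(axis=1, keepdims=True)
    lse = row_max[:, 0] + np.log(np.exp(q_values - row_max).sum(axis=1))
    # Q-value of the action that was actually taken in the dataset.
    q_data = q_values[np.arange(len(dataset_actions)), dataset_actions]
    # Large when some non-dataset action has a high Q-value; minimizing it
    # pushes those values down relative to the dataset action.
    return float(np.mean(lse - q_data))
```

The penalty is never negative, because the log-sum-exp is at least as large as any single Q-value, and it grows when an action outside the dataset is assigned a high value.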