Conservative Q-Learning (CQL)
Introduced to me by Jason Ma.
https://arxiv.org/pdf/2006.04779
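This is how you do offline RL with less overestimation bias: CQL learns a conservative Q-function that avoids overvaluing actions not covered by the dataset. Honestly, I'm still quite confused by it.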
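As far as I understand it, the core trick is a regularizer added to the usual Bellman error: push Q-values down across all actions (via a log-sum-exp) and push them up on the actions that actually appear in the dataset. Below is a minimal sketch for a single state with discrete actions, assuming the CQL(H) variant from the paper. The function names and the `alpha` default are my own, for illustration only, and are not the paper's code.

```python
import math

def logsumexp(values):
    # Numerically stable log(sum(exp(v))) over a list of floats.
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def cql_penalty(q_values, data_action):
    # Conservative term: soft maximum over all actions minus the Q-value
    # of the action actually taken in the dataset. Always >= 0.
    return logsumexp(q_values) - q_values[data_action]

def cql_loss(q_values, data_action, td_target, alpha=1.0):
    # Sketch of the per-sample objective: alpha * penalty + 0.5 * squared TD error.
    # td_target is assumed to be computed elsewhere (e.g. r + gamma * max Q(s', .)).
    td_error = q_values[data_action] - td_target
    return alpha * cql_penalty(q_values, data_action) + 0.5 * td_error ** 2
```

The penalty grows when an action outside the dataset has a high Q-value, which is what discourages the learned policy from exploiting overestimated out-of-distribution actions.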