Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning
Builds on top of Conservative Q-Learning.
I watched Sergey’s talk at RLC 2024, where he described a big gap between offline RL and online RL; Cal-QL tries to close this gap.
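
To make the "builds on Conservative Q-Learning" point concrete, here is a minimal sketch of how I understand the change to the regularizer. CQL pushes Q-values down on actions sampled from the learned policy and up on dataset actions. Cal-QL, as I read the paper, only pushes down while the Q-value is above a reference value (for example a Monte Carlo return estimate of the behavior policy), so the learned Q-values do not collapse below what the data already achieves. The function names, the plain-list inputs, and the toy numbers are all my own assumptions for illustration, not the authors' code.

```python
from statistics import fmean


def cql_regularizer(q_policy, q_data):
    """CQL-style penalty (simplified, sample-based form).

    q_policy: Q(s, a) for actions sampled from the learned policy.
    q_data:   Q(s, a) for actions stored in the offline dataset.
    Minimizing this lowers Q on policy actions and raises it on data actions.
    """
    return fmean(q_policy) - fmean(q_data)


def cal_ql_regularizer(q_policy, q_data, v_ref):
    """Cal-QL-style penalty: same as above, but Q on policy actions is
    clipped from below by a reference value v_ref (one per state).

    Where Q is already below v_ref, the first term no longer depends on Q,
    so that sample stops being pushed down further.
    """
    clipped = [max(q, v) for q, v in zip(q_policy, v_ref)]
    return fmean(clipped) - fmean(q_data)


# Toy batch of two states (made-up numbers).
q_policy = [1.0, 5.0]   # learned Q on policy actions
q_data = [2.0, 2.0]     # learned Q on dataset actions
v_ref = [3.0, 3.0]      # hypothetical reference-policy values

cql_penalty = cql_regularizer(q_policy, q_data)
cal_ql_penalty = cal_ql_regularizer(q_policy, q_data, v_ref)
```

In the first state the Q-value (1.0) is already below the reference value (3.0), so the Cal-QL term treats it as a constant and stops penalizing it; only the second state still gets pushed down. In a real implementation these terms would be computed on batched tensors and added to the usual TD loss with a weighting coefficient.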