🛠️ Steven Gong

Sep 01, 2025, 1 min read

Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems

A good survey of the open problems in offline reinforcement learning.

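Off-policy policy gradient is possible via Importance Sampling: samples collected under a behavior policy are reweighted by the ratio of target-policy to behavior-policy probabilities, so the gradient of the target policy can be estimated without collecting new data.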
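
To make the importance-sampling idea concrete, here is a minimal sketch for a one-step (bandit) problem with a softmax target policy. This is a simplification I am assuming for illustration; it is not the estimator as written in the survey, and the names `softmax` and `is_policy_gradient` are hypothetical. Each logged sample is weighted by pi(a) / beta(a), where beta is the behavior policy that produced the data.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def is_policy_gradient(theta, actions, rewards, behavior_probs):
    """Importance-weighted REINFORCE estimate for a one-step problem.

    theta          -- logits of the softmax target policy pi (assumed form)
    actions        -- logged action indices, sampled from the behavior policy
    rewards        -- logged rewards, one per action
    behavior_probs -- beta(a_i), the behavior policy's probability of each
                      logged action
    Returns the estimated gradient of expected reward with respect to theta.
    """
    pi = softmax(theta)
    n = len(actions)
    grad = [0.0] * len(theta)
    for a, r, b in zip(actions, rewards, behavior_probs):
        w = pi[a] / b  # importance weight pi(a) / beta(a)
        for k in range(len(theta)):
            # d/d theta_k of log pi(a) for a softmax policy
            grad_log_pi = (1.0 if k == a else 0.0) - pi[k]
            grad[k] += w * r * grad_log_pi / n
    return grad


# Data logged by a behavior policy that picks action 0 with probability 0.8.
grad = is_policy_gradient(
    theta=[0.0, 0.0],
    actions=[0, 0, 0, 0, 1],
    rewards=[1.0, 1.0, 1.0, 1.0, 2.0],
    behavior_probs=[0.8, 0.8, 0.8, 0.8, 0.2],
)
print(grad)
```

In this toy dataset the logged action frequencies match the behavior policy exactly, so the estimate coincides with the true gradient for the uniform target policy. In the sequential setting the weight becomes a product of per-step ratios along the trajectory, which is where the variance grows quickly.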

Backlinks

Distributional Shift
Offline Reinforcement Learning
Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data
