Reinforcement Learning Metrics
Debugging RL is really hard. In supervised learning, you can just look at loss and validation curves. But what do we look at for RL?
- The Q-chunking codebase, for example, already computes a good handful of these metrics for you; they are worth inspecting yourself
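Below is a minimal sketch of the kind of per-update critic diagnostics such codebases typically log. The metric names and the `q_values` / `td_errors` arrays are illustrative assumptions, not the repo's actual keys.

```python
import numpy as np

def critic_diagnostics(q_values: np.ndarray, td_errors: np.ndarray) -> dict:
    """Summarize a batch of Q estimates and TD errors into loggable scalars."""
    return {
        "q/mean": float(q_values.mean()),
        "q/std": float(q_values.std()),        # a blow-up here often precedes divergence
        "q/max": float(q_values.max()),
        "td_error/mean_abs": float(np.abs(td_errors).mean()),
    }

# Random placeholder data standing in for one training batch.
rng = np.random.default_rng(0)
print(critic_diagnostics(rng.normal(size=256), rng.normal(size=256)))
```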
An important skill is diagnosing overfitting vs. non-overfitting.
For example, suppose our task has a 0% success rate after offline RL. Why is the success rate 0% during eval?
- Prove (or rule out) a state distribution mismatch
- Plot the distribution of states in the offline dataset
- Compare it against the state distribution the policy actually visits in the online RL / eval setting (see the sketch below)
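A minimal sketch of that comparison, assuming you already have an `offline_states` array from the dataset and an `eval_states` array collected during rollouts (both stood in by random placeholders here):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
offline_states = rng.normal(loc=0.0, scale=1.0, size=(10_000, 4))  # placeholder dataset states
eval_states = rng.normal(loc=1.5, scale=0.5, size=(2_000, 4))      # placeholder rollout states

dims = offline_states.shape[1]
fig, axes = plt.subplots(1, dims, figsize=(3 * dims, 3), sharey=True)
for d, ax in enumerate(axes):
    ax.hist(offline_states[:, d], bins=50, density=True, alpha=0.5, label="offline data")
    ax.hist(eval_states[:, d], bins=50, density=True, alpha=0.5, label="eval rollouts")
    ax.set_title(f"state dim {d}")
axes[0].legend()
plt.tight_layout()
plt.savefig("state_distribution_mismatch.png")
```

Overlaid per-dimension histograms are a crude but fast check: if the rollout states sit in a region the dataset barely covers, the mismatch is usually visible immediately.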
Offline-to-online RL
- Prove (or rule out) unlearning, i.e. a sharp drop in performance right after switching from offline training to online updates (see the plot sketch after this list)
Metrics:
- Q variance (e.g., disagreement among an ensemble of critics)
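One way to make the unlearning claim concrete is to log the eval success rate throughout training and mark the step where offline training hands off to online updates; a sharp drop right after that boundary is the signature. The `eval_steps`, `eval_success`, and `online_start` values below are placeholders, not real results.

```python
import matplotlib.pyplot as plt

eval_steps = [0, 25_000, 50_000, 75_000, 100_000, 125_000, 150_000]  # logged eval points (placeholder)
eval_success = [0.1, 0.4, 0.55, 0.6, 0.15, 0.3, 0.5]                 # logged success rates (placeholder)
online_start = 100_000                                               # step where online RL begins (placeholder)

plt.plot(eval_steps, eval_success, marker="o")
plt.axvline(online_start, linestyle="--", color="gray", label="offline → online switch")
plt.xlabel("gradient steps")
plt.ylabel("eval success rate")
plt.legend()
plt.savefig("unlearning_check.png")
```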
Q Variance
Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data
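A generic sketch of one common way to compute a Q-variance metric: disagreement across an ensemble of critics, averaged over a batch. The tiny linear "critics" below are placeholders for real networks, and this is not necessarily the exact quantity tracked in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
state_dim, action_dim, n_critics, batch = 8, 2, 5, 256

# Placeholder critic parameters: each critic is just a random linear map here.
critic_weights = rng.normal(size=(n_critics, state_dim + action_dim))

def q_ensemble(states: np.ndarray, actions: np.ndarray) -> np.ndarray:
    """Return Q estimates with shape (n_critics, batch)."""
    sa = np.concatenate([states, actions], axis=-1)  # (batch, state_dim + action_dim)
    return critic_weights @ sa.T                     # (n_critics, batch)

states = rng.normal(size=(batch, state_dim))
actions = rng.normal(size=(batch, action_dim))

qs = q_ensemble(states, actions)
q_variance = qs.var(axis=0).mean()  # disagreement across critics, averaged over the batch
print({"q/ensemble_variance": float(q_variance)})
```

Large disagreement on states the policy actually visits is a common warning sign that Q estimates are unreliable, e.g. around the offline-to-online switch.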
Qualitative Metrics
What Matters for Batch Online Reinforcement Learning in Robotics