Distribution Shift
Distribution shift is a very common term in the robotics world.
In imitation learning, the policy is trained on demonstrations from an expert.
- Training data consists of state-action pairs collected in states visited by the expert → (s, a_expert).
- But when the learned policy is deployed, it doesn’t act exactly like the expert — it makes small mistakes.
- These mistakes push the agent into new states the expert never visited, meaning the learner faces inputs it never saw in training.
- This mismatch between training distribution (expert’s states) and test distribution (learner’s states) is the distribution shift.
Result: errors compound over time → performance collapses, because each mistake leaves the policy in states where its predictions are even less reliable, so the next mistake tends to be larger.
I first made this diagram while reading ALOHA; it is also mentioned in my World Model + RL post.
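To make the compounding-error story above concrete, here is a minimal sketch of distribution shift in behavior cloning. Everything in it is an illustrative assumption on my part (a toy circular-track task, a 1-nearest-neighbor "cloned" policy), not the setup from ALOHA or any particular paper: the expert collects demonstrations on the track, the cloned policy copies the nearest demonstrated action, and we measure how far its own rollout drifts from the states seen in training.

```python
# Toy illustration of distribution shift in behavior cloning.
# The task, policies, and numbers are illustrative assumptions, not from any specific paper.
import numpy as np

rng = np.random.default_rng(0)

def expert_action(s):
    """Expert drives counter-clockwise around the unit circle and actively
    corrects any radial drift, so its own rollouts stay near the circle."""
    r = np.linalg.norm(s)
    tangent = 0.1 * np.array([-s[1], s[0]]) / max(r, 1e-8)   # move along the circle
    radial = -0.2 * (r - 1.0) * s / max(r, 1e-8)             # steer back toward radius 1
    return tangent + radial

def rollout(policy, T=300, noise=0.01):
    s = np.array([1.0, 0.0])
    states = []
    for _ in range(T):
        s = s + policy(s) + rng.normal(0.0, noise, size=2)   # small execution noise
        states.append(s.copy())
    return np.array(states)

# 1. Collect demonstrations: (state, expert action) pairs along expert rollouts.
demo_states = rollout(expert_action)
demo_actions = np.array([expert_action(s) for s in demo_states])

# 2. "Behavior cloning" via 1-nearest-neighbor: copy the action of the closest demo state.
def cloned_action(s):
    i = np.argmin(np.linalg.norm(demo_states - s, axis=1))
    return demo_actions[i]

# 3. Roll out the cloned policy and measure how far it strays from the training data.
learner_states = rollout(cloned_action)
dist_to_data = np.min(
    np.linalg.norm(learner_states[:, None, :] - demo_states[None, :, :], axis=2), axis=1
)
print(f"mean distance to training states, first 50 steps: {dist_to_data[:50].mean():.3f}")
print(f"mean distance to training states, last 50 steps:  {dist_to_data[-50:].mean():.3f}")
```

On a run like this, the learner's distance to the training states typically grows with the horizon: the nearest-neighbor policy copies corrections that were appropriate for the expert's states, not for the states the learner actually ends up in, which is exactly the training/test mismatch described above.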