Hierarchical Behavior Cloning (HBC)

I first came across this in the Robomimic paper.

The paper describes HBC as a hierarchical policy:

“HBC consists of a low-level policy that is conditioned on future observations s_g ∈ S (termed subgoals) and outputs action sequences to try and achieve them, and a high-level policy that predicts future subgoals from the current observation.”
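
To make the two levels in that quote concrete, here is a minimal sketch of how they might fit together at rollout time. This is not Robomimic's implementation: the class names, the random linear "networks", the `toy_step` dynamics, and the choice to execute each action sequence open-loop are all assumptions made for illustration. In practice both policies would be learned from demonstrations.

```python
import numpy as np


class HighLevelPolicy:
    """Hypothetical stand-in: predicts a subgoal (a future observation)
    from the current observation. A random linear map replaces a trained model."""

    def __init__(self, obs_dim, rng):
        self.W = rng.normal(scale=0.1, size=(obs_dim, obs_dim))

    def predict_subgoal(self, obs):
        # The subgoal lives in observation space, as in the quoted definition.
        return obs + obs @ self.W


class LowLevelPolicy:
    """Hypothetical stand-in: conditioned on the current observation and a
    subgoal, outputs a sequence of `horizon` actions."""

    def __init__(self, obs_dim, action_dim, horizon, rng):
        self.horizon = horizon
        self.action_dim = action_dim
        self.W = rng.normal(scale=0.1, size=(2 * obs_dim, horizon * action_dim))

    def predict_actions(self, obs, subgoal):
        x = np.concatenate([obs, subgoal])
        return (x @ self.W).reshape(self.horizon, self.action_dim)


def toy_step(obs, action):
    # Assumed toy dynamics (action_dim == obs_dim): nudge the observation.
    return obs + 0.1 * action


def rollout(high, low, obs, step_fn, num_subgoals):
    """Alternate between the two levels: pick a subgoal, then execute the
    low-level action sequence before asking for the next subgoal."""
    actions_taken = []
    for _ in range(num_subgoals):
        subgoal = high.predict_subgoal(obs)
        for action in low.predict_actions(obs, subgoal):
            obs = step_fn(obs, action)
            actions_taken.append(action)
    return obs, np.stack(actions_taken)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    high = HighLevelPolicy(obs_dim=4, rng=rng)
    low = LowLevelPolicy(obs_dim=4, action_dim=4, horizon=3, rng=rng)
    final_obs, actions = rollout(high, low, np.ones(4), toy_step, num_subgoals=2)
    print(actions.shape)  # 2 subgoals x 3 actions each, 4 dims per action
```

The point of the sketch is the interface: the high-level policy maps an observation to a subgoal, and the low-level policy maps (observation, subgoal) to actions. How often the subgoal is refreshed is a design choice this sketch fixes arbitrarily at once per action sequence.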