This is one of Chelsea’s papers on unifying RLHF for robotics. It also cites several papers on how RLHF is done.