Advantage-Weighted Regression (AWR)

Saw this in the online batch RL paper.

https://arxiv.org/abs/1910.00177