High-Dimensional Continuous Control Using Generalized Advantage Estimation

Heard about this from Spinning Up.

There are a few that we could choose:

https://arxiv.org/pdf/1506.02438