Policy Gradient Methods
Proximal Policy Optimization

Resources
Lecture 4: TRPO, PPO from Deep RL Foundations, slides here
https://lilianweng.github.io/posts/2018-04-08-policy-gradient/
https://openai.com/blog/openai-baselines-ppo/
https://arxiv.org/pdf/1707.06347.pdf