Maximum a Posteriori Policy Optimisation (MPO)
Mentioned alongside AWR and SAC in the paper "Co-Adaptation of Algorithmic and Implementational Innovations in Inference-based Deep Reinforcement Learning".