Log-Derivative Trick Remember that ∇logf=f1∇f We can rewrite this as ∇f=f∇logf This is used the derive the original Policy Gradient Methods formula