Exponential Moving Average (EMA)

https://medium.com/analytics-vidhya/understanding-exponential-moving-averages-e3f020d9d13b

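$$\text{EMA}_t = (1 - \alpha)\,\text{EMA}_{t-1} + \alpha\, x_t$$

  • where $x_t$ is the new target and $\alpha \in (0, 1]$ controls how quickly the average moves toward it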
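A minimal sketch of the EMA update in plain Python. The function names, the smoothing factor `alpha`, and the choice to seed the average with the first observation are illustrative assumptions, not taken from the linked article.

```python
def ema_update(prev, target, alpha):
    """Move the running average a fraction alpha toward the new target."""
    return (1.0 - alpha) * prev + alpha * target


def ema_series(values, alpha):
    """Return the EMA after each observation, seeded with the first value."""
    if not values:
        return []
    out = [values[0]]
    for x in values[1:]:
        out.append(ema_update(out[-1], x, alpha))
    return out


print(ema_series([10.0, 20.0, 20.0, 20.0], alpha=0.5))
# -> [10.0, 15.0, 17.5, 18.75]
```

Each step closes half of the remaining gap to the target when `alpha=0.5`; a smaller `alpha` gives a smoother but slower-moving average.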

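You see this everywhere in the Bellman update, for example in Q-learning with learning rate $\alpha$:

$$Q(s, a) \leftarrow (1 - \alpha)\, Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') \right]$$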

  • It’s called a semi-gradient algorithm, because the bootstrapped target is treated as a fixed value rather than differentiated through.

An EMA model (Exponential Moving Average model) is typically used in machine learning to maintain a smoothed version of a model’s parameters over time.

We do this for

Polyak Averaging

I was introduced to this when learning DDPG.

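It’s not quite the same as the EMA above, because here we have sort of two moving targets. We are moving our target parameters $\theta_{\text{target}}$ in the direction of the online parameters $\theta$, but $\theta$ is also continuously moving.

$$\theta_{\text{target}} \leftarrow \tau\, \theta + (1 - \tau)\, \theta_{\text{target}}$$

  • $\theta_{\text{target}}$ is just always lagging behind $\theta$, which gives smoother Q-targets, reduces variance, and helps stabilize TD updates.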
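A rough sketch of a Polyak (soft) target update, assuming parameters are stored as a dict of plain floats. The name `soft_update`, the single weight `"w"`, and the constant drift that stands in for gradient steps are all hypothetical; in a real DDPG implementation these would be network weight tensors.

```python
def soft_update(target_params, online_params, tau):
    """Polyak-average the online parameters into the target parameters, in place."""
    for name, online_value in online_params.items():
        target_params[name] = tau * online_value + (1.0 - tau) * target_params[name]


online = {"w": 0.0}
target = {"w": 0.0}
gaps = []
for step in range(1000):
    online["w"] += 0.01  # stand-in for a gradient step: the online weight keeps moving
    soft_update(target, online, tau=0.005)
    gaps.append(online["w"] - target["w"])

# The target follows the online weight but never catches up while it keeps moving.
print(online["w"], target["w"])
```

With a small `tau` the target changes slowly, so the values it produces are more stable from one update to the next, at the cost of trailing further behind the online parameters.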