n-step Reinforcement Learning
Had to really understand this once I started reading the Reinforcement Learning with Action Chunking paper, where they talk about bias in n-step backups.
https://gibberblot.github.io/rl-notes/single-agent/n-step.html
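n-step returns can be plugged into lots of TD-style methods (n-step TD, n-step SARSA, n-step Q-learning). But most standard RL algorithms use the 1-step backup: take one reward, then bootstrap from the value estimate of the next state.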
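A minimal sketch of the n-step target, to make the difference from the 1-step backup concrete. The function name `n_step_target` and its arguments are made up for this note, and it assumes the n rewards and the bootstrap value V(s_{t+n}) are already in hand:

```python
def n_step_target(rewards, bootstrap_value, gamma):
    """n-step TD target: sum_{k<n} gamma^k * r_k + gamma^n * bootstrap_value.

    rewards: the n rewards observed after the current state (hypothetical input).
    bootstrap_value: value estimate of the state reached after those n steps.
    """
    target = bootstrap_value
    # Fold the rewards in from last to first so each one picks up the right discount.
    for r in reversed(rewards):
        target = r + gamma * target
    return target


# 3-step target: 1 + 0.5*1 + 0.25*1 + 0.125*10
print(n_step_target([1.0, 1.0, 1.0], 10.0, 0.5))  # 3.0
```

Passing a single reward gives the usual 1-step target r + gamma * V(s'). Larger n leans more on observed rewards and less on the (possibly wrong) value estimate.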
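Not the same thing as TD(λ)!!! n-step uses one fixed lookahead n, while TD(λ) mixes the n-step returns for all n, weighted by (1 - λ)λ^(n-1).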
TODO: look into the actual differences
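A starting point for the comparison, written in the standard textbook notation (the symbols G, R, V, γ, λ are assumptions here, not taken from the linked notes). The n-step return bootstraps after exactly n rewards, while the λ-return is a geometrically weighted average of all the n-step returns:

$$
G_t^{(n)} = \sum_{k=0}^{n-1} \gamma^{k} R_{t+k+1} + \gamma^{n} V(S_{t+n})
$$

$$
G_t^{\lambda} = (1 - \lambda) \sum_{n=1}^{\infty} \lambda^{n-1} G_t^{(n)}
$$

With λ = 0 the λ-return reduces to the 1-step return, and as λ approaches 1 it approaches the Monte Carlo return.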