Bootstrapping and Sampling
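In RL, bootstrapping means updating a value estimate using other value estimates (for example, the current estimate of the next state's value) rather than relying only on observed rewards and complete returns.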
“TD learning methods update targets with regard to existing estimates rather than exclusively relying on actual rewards and complete returns as in MC methods. This approach is known as bootstrapping.” Source
Bootstrapping: the update involves an estimate
- DP bootstraps
- MC does not bootstrap
- TD bootstraps
Sampling: the update samples an expectation
- DP does not sample
- MC samples
- TD samples
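The two classifications above can be made concrete with a minimal sketch of one update per method. Everything here is an assumption for illustration and not taken from the sources: the 3-state chain `P`, the discount `GAMMA`, the step size `alpha`, and the function names `dp_update`, `mc_update`, `td_update`, and `sample_step`. The sketch assumes state-value prediction under a fixed policy.

```python
import random

# Hypothetical toy model (illustrative only): states 0 and 1 are non-terminal,
# state 2 is terminal. P[s] lists (probability, next_state, reward).
P = {
    0: [(0.5, 1, 0.0), (0.5, 2, 1.0)],
    1: [(1.0, 2, 2.0)],
}
TERMINAL = 2
GAMMA = 0.9


def sample_step(s, rng):
    """Draw one (next_state, reward) transition from the model."""
    u, acc = rng.random(), 0.0
    for p, s2, r in P[s]:
        acc += p
        if u < acc:
            return s2, r
    return P[s][-1][1], P[s][-1][2]  # guard against rounding in acc


def dp_update(V, s):
    """DP: exact expectation over the model (no sampling); uses V[s2] (bootstraps)."""
    return sum(p * (r + GAMMA * V[s2]) for p, s2, r in P[s])


def mc_update(V, s, alpha, rng):
    """MC: sampled complete return (samples); never reads V[s2] (no bootstrap)."""
    G, discount, cur = 0.0, 1.0, s
    while cur != TERMINAL:
        cur, r = sample_step(cur, rng)
        G += discount * r
        discount *= GAMMA
    return V[s] + alpha * (G - V[s])


def td_update(V, s, alpha, rng):
    """TD(0): one sampled transition (samples) plus V[s2] (bootstraps)."""
    s2, r = sample_step(s, rng)
    return V[s] + alpha * (r + GAMMA * V[s2] - V[s])


rng = random.Random(0)
V = {0: 0.0, 1: 0.0, 2: 0.0}
V[1] = dp_update(V, 1)            # expectation over the single transition from state 1
V[0] = dp_update(V, 0)            # averages both branches, reusing the estimate V[1]
mc_v0 = mc_update(V, 0, 0.1, rng) # moves V[0] toward one sampled return
td_v0 = td_update(V, 0, 0.1, rng) # moves V[0] toward r + GAMMA * V[s2]
```

Only `dp_update` iterates over every entry of `P[s]`, and only `mc_update` never reads the value of a successor state; those two facts are what the lists above summarize.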
Bootstrapping main ideas: https://www.youtube.com/watch?v=Xz0x-8-cgaQ