Bootstrapping and Sampling

In RL, bootstrapping means updating a value estimate from other estimates rather than from exact values.

“TD learning methods update targets with regard to existing estimates rather than exclusively relying on actual rewards and complete returns as in MC methods. This approach is known as bootstrapping.” Source

Bootstrapping: the update involves an estimate

  • DP bootstraps
  • MC does not bootstrap
  • TD bootstraps
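
As a minimal sketch of the distinction above, the snippet below contrasts an MC target (the full discounted return, built only from observed rewards) with a one-step TD target (one observed reward plus the current estimate of the next state's value). The function names, the discount factor, the rewards, and the value 0.5 are all hypothetical, chosen only for illustration.

```python
GAMMA = 0.9  # assumed discount factor


def mc_target(rewards, gamma=GAMMA):
    """MC target: the complete discounted return. No estimate is involved."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def td_target(reward, next_value, gamma=GAMMA):
    """TD(0) target: one real reward plus the current estimate V(s').

    Using next_value (itself an estimate) is what makes this bootstrapping.
    """
    return reward + gamma * next_value


# Hypothetical episode: rewards observed from the start state until termination.
episode_rewards = [1.0, 0.0, 2.0]
# Hypothetical current estimate of the value of the state reached after step one.
v_next_estimate = 0.5

mc = mc_target(episode_rewards)                     # needs the whole episode
td = td_target(episode_rewards[0], v_next_estimate)  # needs only one step
```

The MC target can only be computed once the episode has finished, while the TD target is available after a single step, at the cost of depending on how good the current estimate is.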

Sampling: the update samples an expectation

  • DP does not sample
  • MC samples
  • TD samples
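
To illustrate the sampling distinction, here is a small sketch comparing a DP-style expected update, which sums over every outcome of a known model, with a TD-style update that draws a single outcome. The one-state model, the value table, and the step size are hypothetical and exist only for this example.

```python
import random

GAMMA = 0.9  # assumed discount factor

# Hypothetical model for state "s" under a fixed policy:
# each entry is (probability, reward, next_state).
MODEL = {"s": [(0.5, 1.0, "a"), (0.5, 0.0, "b")]}
# Hypothetical current value estimates.
V = {"s": 0.0, "a": 2.0, "b": 4.0}


def dp_backup(state, values, model=MODEL, gamma=GAMMA):
    """Expected update: weights every outcome by its probability (no sampling)."""
    return sum(p * (r + gamma * values[s2]) for p, r, s2 in model[state])


def sample_transition(state, rng, model=MODEL):
    """Draw one (reward, next_state) pair, as an environment would."""
    outcomes = model[state]
    weights = [p for p, _, _ in outcomes]
    _, r, s2 = rng.choices(outcomes, weights=weights, k=1)[0]
    return r, s2


def td_update(state, values, rng, alpha=0.1, gamma=GAMMA):
    """Sample update: a single drawn outcome stands in for the expectation."""
    r, s2 = sample_transition(state, rng)
    return values[state] + alpha * (r + gamma * values[s2] - values[state])


expected = dp_backup("s", V)                  # uses the full model
sampled = td_update("s", V, random.Random(0))  # uses one sampled transition
```

Both updates here bootstrap, since each uses the estimates in `V` for the successor states; they differ only in whether the expectation is computed exactly from the model or approximated by a sample.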

Bootstrapping main ideas: