Decision Tree

Random Forest

A random forest is Bagging on Decision Trees.
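As a rough illustration of the bagging part, the sketch below draws bootstrap samples (sampling rows with replacement) and combines per-tree predictions by majority vote. The helper names `bootstrap_sample` and `majority_vote` and the toy data are hypothetical, chosen only for this example. Tree fitting itself is left out.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows uniformly at random, with replacement."""
    return [rng.choice(rows) for _ in rows]


def majority_vote(predictions):
    """Return the most common label among the per-tree predictions."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
data = [(0.1, "a"), (0.4, "a"), (0.6, "b"), (0.9, "b")]  # toy (feature, label) rows

# One bootstrap dataset per tree; each tree would be fit on its own sample.
samples = [bootstrap_sample(data, rng) for _ in range(3)]

# Suppose three fitted trees predicted these labels for one new point.
final_label = majority_vote(["a", "b", "a"])
```

Each bootstrap dataset has the same size as the original, but some rows repeat and others are missing, which is what makes the trees differ from one another.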

• Twist to add randomness/make bootstrap samples “look” more independent
• Standard decision trees: when choosing which feature to split on, look at all 𝑑 features and pick the “best” one

  • Downside: if one feature is very informative, it will be chosen for splits in the trees built on every bootstrap dataset, so the trees end up highly correlated

Random forests: When choosing which feature to split on, look at a random subsample of features and pick the “best” one

  • Say,
  • Resample for each split
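The difference between the two split rules can be sketched as follows. This is a minimal illustration, not a full tree learner: it scores candidate splits by weighted Gini impurity (an assumption; the notes only say “best”), and the subsample size `m`, the function names, and the toy data are all hypothetical.

```python
import random


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(X, y, candidate_features):
    """Return (feature, threshold, score) with the lowest weighted Gini
    impurity among the candidate features, or None if no split is valid."""
    best = None
    n = len(y)
    for j in candidate_features:
        for t in sorted({row[j] for row in X}):
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            if not left or not right:
                continue  # a split must send rows to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, t, score)
    return best


def random_forest_split(X, y, m, rng):
    """Sample m of the d features (without replacement), then pick the best
    split among only those. Called anew at every split, so the candidate
    set is resampled each time."""
    d = len(X[0])
    candidates = rng.sample(range(d), m)
    return candidates, best_split(X, y, candidates)


# Toy data: feature 0 separates the classes perfectly.
X = [[0, 5, 1], [0, 3, 2], [1, 4, 1], [1, 6, 2]]
y = [0, 0, 1, 1]

standard = best_split(X, y, range(len(X[0])))  # looks at all d features
candidates, randomized = random_forest_split(X, y, m=2, rng=random.Random(1))
```

The standard rule always picks feature 0 on this data. The randomized rule can only pick it when feature 0 happens to be in the sampled candidates, so other features get a chance to appear in some splits.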