Random Forest
A random forest is bagging applied to decision trees.
• Bagging on decision trees
• Twist to add randomness/make bootstrap samples “look” more independent
• Standard decision trees: When choosing which feature to split on, look at all 𝑑 features and pick the “best” one
- Downside: if one feature is very informative, it will be chosen for the split in the trees built on all of the bootstrap datasets, so the trees end up looking alike
• Random forests: When choosing which feature to split on, look at a random subsample of the features and pick the “best” one
- Say,
- Resample the feature subset for each split
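The difference between the two split rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy data, the Gini impurity criterion as the measure of “best”, the helper names `gini`, `best_split`, and `random_forest_split`, and the subsample size `m` are all hypothetical choices made for this example, not part of the original slides.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(X, y, feature_ids):
    """Return the (feature, threshold) pair with the lowest weighted
    Gini impurity, searching only over the features in feature_ids."""
    best, best_score = (None, None), float("inf")
    n = len(y)
    for j in feature_ids:
        for t in sorted({row[j] for row in X}):
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            if not left or not right:
                continue  # skip splits that leave one side empty
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (j, t), score
    return best

def random_forest_split(X, y, m, rng):
    """Random-forest rule: draw a fresh random subset of m features
    (an assumed size), then pick the best split within that subset.
    Called once per split, so the subset is resampled each time."""
    d = len(X[0])
    candidates = rng.sample(range(d), m)
    return best_split(X, y, candidates)

# Toy data (made up): feature 0 separates the classes perfectly,
# features 1 and 2 are only weakly related to the label.
X = [
    [0.1, 5.0, 1.0], [0.2, 3.0, 0.0], [0.3, 4.0, 1.0], [0.4, 1.0, 0.0],
    [0.6, 2.0, 0.0], [0.7, 6.0, 1.0], [0.8, 2.5, 0.0], [0.9, 4.5, 1.0],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

# Standard decision tree rule: look at all d features.
print("standard split:", best_split(X, y, range(3)))

# Random forest rule: the chosen feature varies from split to split.
rng = random.Random(0)
print("random-subset splits:",
      [random_forest_split(X, y, 2, rng) for _ in range(5)])
```

With all features available, the standard rule picks feature 0 every time. With a random subset, feature 0 is sometimes not among the candidates, so another feature has to be used, which is what makes the resulting trees less alike.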