Random Forest
A random forest is bagging applied to decision trees, with extra feature subsampling at each split.
Why subsample features if bagging already randomizes data?
Pure bagging can leave one very informative feature dominating every tree, so the bootstrapped trees stay highly correlated. Random feature subsets at each split decorrelate them further.
- Standard decision tree: at each split, look at all features and pick the best
- Random forest: look at a random subsample of features and pick the best among those
- Typical choice: about √d of the d total features for classification, d/3 for regression
- Resample the feature subset independently for each split (a minimal sketch follows after this list)
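To pin down the mechanics, here is a minimal sketch (my own illustration, not code from the original notes). It leans on scikit-learn's `DecisionTreeClassifier`, whose `max_features` option redraws a random feature subset at every split, and adds the bootstrap loop and majority vote on top; the helper names `fit_forest` and `predict_forest` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

def fit_forest(X, y, n_trees=100, seed=0):
    """Bagging + per-split feature subsampling (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    trees = []
    for _ in range(n_trees):
        idx = rng.integers(0, n, size=n)  # bootstrap: sample n rows with replacement
        tree = DecisionTreeClassifier(
            max_features="sqrt",  # consider ~sqrt(d) random features at each split
            random_state=int(rng.integers(2**31 - 1)),
        )
        tree.fit(X[idx], y[idx])
        trees.append(tree)
    return trees

def predict_forest(trees, X):
    """Majority vote over the trees (assumes integer class labels)."""
    votes = np.stack([t.predict(X) for t in trees]).astype(int)  # (n_trees, n_samples)
    return np.array([np.bincount(col).argmax() for col in votes.T])

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
trees = fit_forest(X, y)
print("train accuracy:", (predict_forest(trees, X) == y).mean())
```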
Intuition
Bagging’s variance reduction only works if the trees are close to uncorrelated. If one feature is far more predictive than the rest, every bootstrap tree splits on it first and they all end up near-clones, so averaging them barely reduces variance. Hiding that feature at most splits (by random subsetting) forces different trees to find different paths through the data, which is what lets the average actually smooth things out. You deliberately make each tree slightly worse so that the ensemble becomes much better.
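To see the decorrelation in numbers, here is a small experiment (illustrative; the synthetic data setup is my assumption, and exact numbers will vary). In scikit-learn's `RandomForestClassifier`, `max_features=None` recovers pure bagging of trees, while `"sqrt"` hides features at each split; comparing the mean correlation between individual trees' held-out predictions shows the gap.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

def make_data(n, d=20):
    # Synthetic setup (an assumption): feature 0 dominates, the rest carry weak signal.
    X = rng.normal(size=(n, d))
    y = (3.0 * X[:, 0] + 0.3 * X[:, 1:].sum(axis=1) + rng.normal(size=n) > 0).astype(int)
    return X, y

X_train, y_train = make_data(2000)
X_test, _ = make_data(1000)

for max_features in (None, "sqrt"):  # None = all features at each split = pure bagging
    forest = RandomForestClassifier(
        n_estimators=50, max_features=max_features, random_state=0
    ).fit(X_train, y_train)
    # Average pairwise correlation between individual trees' test-set predictions.
    preds = np.stack([t.predict(X_test) for t in forest.estimators_])
    corr = np.corrcoef(preds)
    mean_corr = corr[~np.eye(len(preds), dtype=bool)].mean()
    print(f"max_features={max_features}: mean tree-tree correlation {mean_corr:.2f}")
```

The `"sqrt"` forest should show visibly lower tree-tree correlation on this kind of data, which is exactly the effect the averaging step needs.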