Cross Validation

Cross-validation is a method for hyperparameter tuning, often used when the dataset is small. Typical choices are 3-fold, 5-fold, or 10-fold cross-validation.

To complete a k-fold cross-validation:

  • Split the dataset into k equal folds
  • Use one fold as the validation set and the remaining k-1 folds as the training set
  • Repeat k times (each time changing the validation fold) and average the performance
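The steps above can be sketched as follows. This is a minimal illustration, not a library API: `train_and_eval` is a hypothetical callback standing in for whatever model fitting and scoring you use.

```python
import numpy as np

def k_fold_scores(X, y, k, train_and_eval):
    """Return the k per-fold validation scores.

    `train_and_eval` is a hypothetical user-supplied callback:
    train_and_eval(X_train, y_train, X_val, y_val) -> score.
    """
    indices = np.arange(len(X))
    folds = np.array_split(indices, k)  # split indices into k (nearly) equal folds
    scores = []
    for i in range(k):
        val_idx = folds[i]  # fold i serves as the validation set this round
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        scores.append(train_and_eval(X[train_idx], y[train_idx],
                                     X[val_idx], y[val_idx]))
    return scores  # average these for the final performance estimate
```

Averaging the returned scores (and looking at their spread) gives the cross-validated estimate for one hyperparameter setting.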

Intuition

A single train/val split wastes data (the validation fold never trains a model) and the estimate is noisy (one unlucky split can mislead you). Cross-validation reuses every point as both training and validation across different folds, so the averaged score has lower variance. The cost: you train k models instead of one.

In practice

People tend to prefer a single validation split over cross-validation, since cross-validation is computationally expensive (k training runs instead of one).

For dataset splits, people tend to use 50%-90% of the available training data for training and the rest for validation:

  • If the number of hyperparameters is large, use bigger validation splits
  • If the validation set is small (a few hundred examples or so), it is safer to use cross-validation
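A sketch of these heuristics in code. The 10% holdout and the threshold of 300 examples are illustrative assumptions standing in for "50%-90% for training" and "a few hundred or so"; tune them to your setting.

```python
import numpy as np

def make_validation_split(X, y, val_fraction=0.1, cv_threshold=300, seed=0):
    """Hold out `val_fraction` of the data for validation.

    Returns (X_train, y_train, X_val, y_val, use_cv), where `use_cv`
    flags that the validation set is small (below `cv_threshold`, a
    hypothetical stand-in for "a few hundred") and cross-validation
    would give a more reliable estimate.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))           # shuffle before splitting
    n_val = int(round(len(X) * val_fraction))
    val_idx, train_idx = idx[:n_val], idx[n_val:]
    use_cv = n_val < cv_threshold           # small validation set -> prefer CV
    return X[train_idx], y[train_idx], X[val_idx], y[val_idx], use_cv
```

If `use_cv` comes back true, fall back to k-fold cross-validation instead of trusting the single split.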