Cross Validation

Cross-validation is a method for hyperparameter tuning, often used when the dataset is small. Typical choices are 3-fold, 5-fold, or 10-fold cross-validation.

To complete a k-fold cross-validation:

  • Split the dataset into k equal folds
  • Use one fold as the validation set and the remaining k-1 folds as the training set
  • Repeat k times (each time changing the validation fold) and average the performance
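The steps above can be sketched as follows. This is a minimal illustration, not a library API: `train_and_eval` is a hypothetical callback standing in for whatever model fitting and scoring you use.

```python
import numpy as np

def k_fold_scores(X, y, k, train_and_eval):
    """Return the k per-fold validation scores.

    `train_and_eval` is a hypothetical user-supplied callback:
    train_and_eval(X_train, y_train, X_val, y_val) -> score.
    """
    indices = np.arange(len(X))
    folds = np.array_split(indices, k)  # split indices into k (nearly) equal folds
    scores = []
    for i in range(k):
        val_idx = folds[i]  # fold i serves as the validation set this round
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        scores.append(train_and_eval(X[train_idx], y[train_idx],
                                     X[val_idx], y[val_idx]))
    return scores  # average these for the final performance estimate
```

Averaging the returned scores (and looking at their spread) gives the cross-validated estimate for one hyperparameter setting.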

Intuition

A single train/val split wastes data (the validation fold never trains a model) and the estimate is noisy (one unlucky split can mislead you). Cross-validation reuses every point as both training and validation across different folds, so the averaged score has lower variance. The cost: you train k models instead of one.

In practice

People tend to prefer a single validation split over cross-validation, since cross-validation is computationally expensive (k training runs instead of one).

For dataset splits, people tend to use 50%-90% of the available training data for training and the rest for validation:

  • If the number of hyperparameters is large, use bigger validation splits
  • If the validation set is small (a few hundred examples or so), it is safer to use cross-validation
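A sketch of these heuristics in code. The 10% holdout and the threshold of 300 examples are illustrative assumptions standing in for "50%-90% for training" and "a few hundred or so"; tune them to your setting.

```python
import numpy as np

def make_validation_split(X, y, val_fraction=0.1, cv_threshold=300, seed=0):
    """Hold out `val_fraction` of the data for validation.

    Returns (X_train, y_train, X_val, y_val, use_cv), where `use_cv`
    flags that the validation set is small (below `cv_threshold`, a
    hypothetical stand-in for "a few hundred") and cross-validation
    would give a more reliable estimate.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))           # shuffle before splitting
    n_val = int(round(len(X) * val_fraction))
    val_idx, train_idx = idx[:n_val], idx[n_val:]
    use_cv = n_val < cv_threshold           # small validation set -> prefer CV
    return X[train_idx], y[train_idx], X[val_idx], y[val_idx], use_cv
```

If `use_cv` comes back true, fall back to k-fold cross-validation instead of trusting the single split.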