Cross-validation (CV) splits the training data into multiple train-validation chunks to get a more reliable estimate of model accuracy and to help detect overfitting.
1. K-Fold CV
Break the dataset into "k" equal chunks (folds); each fold is used once as the test set while the rest is used for training.
Example:
Size of dataset (rows) = 500, k = 5
Test size per fold = 500 / 5 = 100
- Take 1st 100 rows as test, remaining as train
- Take 2nd 100 rows as test, remaining as train, and so on…
- Finally, average the model's accuracy scores across all chunks / folds
Note: In CV, the model trained on one fold knows nothing about the others. A fresh model is trained for every fold and the accuracies are averaged, so the final score does not depend on one lucky (or unlucky) split.
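The K-Fold steps above can be sketched in plain Python. This is only an illustration: the helper names (k_fold_indices, majority_label), the synthetic labels, and the "predict the majority class" model are assumptions made for the example and are not part of the notes. In practice a library splitter such as scikit-learn's KFold would normally be used.

```python
def k_fold_indices(n_rows, k):
    """Yield (train_idx, test_idx) for k contiguous folds (hypothetical helper)."""
    fold_size = n_rows // k
    for fold in range(k):
        start = fold * fold_size
        # The last fold absorbs leftover rows when n_rows is not divisible by k.
        stop = n_rows if fold == k - 1 else start + fold_size
        test_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_rows))
        yield train_idx, test_idx


def majority_label(labels):
    """Toy 'model': always predict the most common training label."""
    return max(set(labels), key=labels.count)


# Synthetic labels for 500 rows: every 4th row is class 1, the rest class 0.
y = [1 if i % 4 == 0 else 0 for i in range(500)]

folds = list(k_fold_indices(len(y), 5))
scores = []
for train_idx, test_idx in folds:
    # A fresh model is "fit" on this fold's training rows only.
    prediction = majority_label([y[i] for i in train_idx])
    accuracy = sum(y[i] == prediction for i in test_idx) / len(test_idx)
    scores.append(accuracy)

# Final step: average the per-fold accuracies.
mean_accuracy = sum(scores) / len(scores)
print(scores, mean_accuracy)
```

Each of the 5 folds holds 100 test rows and 400 training rows, matching the 500 / 5 example.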
2. Time Series CV
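Used on time series datasets, where row order matters: each split trains on earlier rows and tests on later ones, so the model is never evaluated on data older than what it was trained on.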
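One common way to split time-ordered data is an expanding window: the training set grows with each split and the test set is always the next block of rows. The sketch below is a plain-Python illustration of that idea. The function name and the 12-row example are assumptions, and it is written in the spirit of scikit-learn's TimeSeriesSplit rather than as a copy of it.

```python
def expanding_window_splits(n_rows, n_splits):
    """Yield (train_idx, test_idx) where training rows always precede test rows."""
    test_size = n_rows // (n_splits + 1)
    first_test_start = n_rows - n_splits * test_size
    for start in range(first_test_start, n_rows, test_size):
        train_idx = list(range(0, start))                 # everything seen so far
        test_idx = list(range(start, start + test_size))  # the next block in time
        yield train_idx, test_idx


splits = list(expanding_window_splits(12, 3))
for train_idx, test_idx in splits:
    print("train:", train_idx, "test:", test_idx)
```

Unlike K-Fold, no split ever trains on rows that come after its test rows.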
3. Grid Search CV
Tries every combination of the given hyper-parameter values, trains the model on each, and keeps the combination that scores best for the use case.
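A minimal sketch of the "all combinations" idea, using only the standard library. The grid values and the validation_score function are hypothetical stand-ins for "train the model with these settings and return its score". scikit-learn packages the same loop as GridSearchCV.

```python
from itertools import product

# Hypothetical hyper-parameter grid (names and values are illustrative only).
param_grid = {"max_depth": [2, 4, 8], "learning_rate": [0.01, 0.1]}


def validation_score(params):
    # Stand-in for training + scoring a model; peaks at max_depth=4, learning_rate=0.1.
    return -abs(params["max_depth"] - 4) - abs(params["learning_rate"] - 0.1)


names = list(param_grid)
# Every combination: 3 values x 2 values = 6 candidates.
candidates = [dict(zip(names, values)) for values in product(*param_grid.values())]

best_params = max(candidates, key=validation_score)
print(len(candidates), best_params)
```

The number of candidates is the product of the list lengths, so the cost grows quickly as more hyper-parameters are added.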
4. Randomized CV
Samples a fixed number of random hyper-parameter combinations instead of trying every one, which is cheaper when the search space is large.
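The random variant can be sketched the same way: draw a fixed number of candidates and keep the best. The ranges, the number of iterations, and the validation_score function below are assumptions for illustration. scikit-learn's equivalent is RandomizedSearchCV.

```python
import random


def validation_score(params):
    # Hypothetical stand-in for "train the model with params and return its score".
    return -abs(params["max_depth"] - 4) - abs(params["learning_rate"] - 0.1)


rng = random.Random(0)  # fixed seed so the run is repeatable
n_iter = 10             # budget: how many random combinations to try

candidates = [
    {
        "max_depth": rng.choice([2, 4, 8, 16]),
        # Sample the learning rate on a log scale between 0.001 and 1.
        "learning_rate": 10 ** rng.uniform(-3, 0),
    }
    for _ in range(n_iter)
]

best_params = max(candidates, key=validation_score)
print(best_params)
```

The budget n_iter is chosen up front, so the cost stays fixed no matter how many hyper-parameters are searched.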
Tip
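- Grid & Randomized CVs are used for tuning hyper-parameters
- K-Fold & Stratified K-Fold (K-Fold that keeps the class proportions the same in every fold) are used for splitting the data during training
Hence, they can be combined: run a grid or randomized search and score each combination with K-Fold.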
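A small sketch of the combination: every candidate in the grid is scored by its average K-Fold result, and the best average wins. Everything here is a toy assumption made for illustration (the synthetic targets, the "shrink times training mean" model, and the helper names). With scikit-learn, the same pairing is usually done by passing a fold count or splitter through the cv argument of GridSearchCV.

```python
def k_fold_indices(n_rows, k):
    """Yield (train_idx, test_idx) for k contiguous folds; assumes n_rows % k == 0."""
    fold_size = n_rows // k
    for fold in range(k):
        start, stop = fold * fold_size, (fold + 1) * fold_size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_rows) if i < start or i >= stop]
        yield train_idx, test_idx


# Synthetic targets cycling through 8, 9, 10, 11, 12.
y = [10 + (i % 5 - 2) for i in range(50)]


def cv_score(shrink, k=5):
    """Mean negative MSE over k folds for the toy model: predict shrink * train mean."""
    fold_scores = []
    for train_idx, test_idx in k_fold_indices(len(y), k):
        train_mean = sum(y[i] for i in train_idx) / len(train_idx)  # "fit" on train rows
        prediction = shrink * train_mean
        mse = sum((y[i] - prediction) ** 2 for i in test_idx) / len(test_idx)
        fold_scores.append(-mse)  # negate so that higher is better
    return sum(fold_scores) / len(fold_scores)


shrink_grid = [0.0, 0.5, 1.0]  # hypothetical hyper-parameter values
results = {shrink: cv_score(shrink) for shrink in shrink_grid}
best_shrink = max(results, key=results.get)
print(results, best_shrink)
```

The outer loop is the hyper-parameter search and the inner loop is the K-Fold split, which is why the two kinds of CV fit together.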