Split the data into multiple train-validation chunks to get a more reliable estimate of model accuracy and to catch overfitting.

1. K-Fold CV

Break the dataset into "k" chunks (folds). Each fold takes one turn as the test/validation set while the remaining folds are used for training.

Example: Pasted image 20240818141003.png

Here, size of dataset (rows) = 500 and k = 5

Test size = 500 / 5 = 100

Take the 1st 100 rows as test, the remaining 400 as train
Take the 2nd 100 rows as test, the remaining 400 as train, and so on…

Finally, average the model's accuracy scores across all folds
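A minimal sketch of the 500-row, k = 5 example above, in plain Python. The helper names `k_fold_indices` and `average_score` are made up for illustration, and the sketch assumes the row count divides evenly by k and that rows are taken in order without shuffling.

```python
def k_fold_indices(n_rows, k):
    """Yield (train_idx, test_idx) for k contiguous, equally sized folds."""
    fold_size = n_rows // k  # assumption: n_rows is divisible by k
    for fold in range(k):
        start = fold * fold_size
        stop = start + fold_size
        test_idx = list(range(start, stop))
        # Everything outside the current test block is used for training.
        train_idx = list(range(0, start)) + list(range(stop, n_rows))
        yield train_idx, test_idx


def average_score(scores):
    """Average the per-fold scores into one cross-validation score."""
    return sum(scores) / len(scores)


folds = list(k_fold_indices(500, 5))
for train_idx, test_idx in folds:
    print(f"test rows {test_idx[0]}-{test_idx[-1]}, train rows: {len(train_idx)}")
```

In practice a model would be trained and scored once per `(train_idx, test_idx)` pair, and the five scores passed to `average_score`.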

Note

In CV, each fold trains a separate model that knows nothing about the others. Their accuracies are averaged, so the final score does not depend on one lucky split, which helps expose overfitting.

2. Time Series CV

Pasted image 20240818142005.png

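Used on time series datasets, where row order matters: each fold trains on earlier observations and tests on later ones, so the model is never scored on data older than what it was trained on.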
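A rough sketch of one common way to split a time series, an expanding training window followed by a fixed-size test block. The helper `time_series_splits` is hypothetical and only similar in spirit to scikit-learn's `TimeSeriesSplit`. It assumes rows are already sorted by time.

```python
def time_series_splits(n_rows, n_splits):
    """Yield (train_idx, test_idx) where training rows always precede test rows."""
    # Assumption: the data is cut into n_splits + 1 equal blocks; leftover rows are ignored.
    test_size = n_rows // (n_splits + 1)
    for i in range(1, n_splits + 1):
        train_end = i * test_size
        test_end = train_end + test_size
        train_idx = list(range(0, train_end))        # all rows seen so far
        test_idx = list(range(train_end, test_end))  # the next block in time
        yield train_idx, test_idx


splits = list(time_series_splits(500, 4))
for train_idx, test_idx in splits:
    print(f"train rows 0-{train_idx[-1]}, test rows {test_idx[0]}-{test_idx[-1]}")
```

Unlike K-Fold, the training set grows with each split and rows from the future never leak into training.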

3. Grid Search CV

Pasted image 20240904171441.png

Try every possible combination of hyperparameters, train a model for each, and see which one best suits the use case
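To make "all possible combinations" concrete, here is a small sketch that enumerates a grid with `itertools.product`. The hyperparameter names and values in `param_grid` are placeholders, not taken from any particular model.

```python
from itertools import product

# Hypothetical grid: 3 values x 2 values = 6 combinations to try.
param_grid = {
    "max_depth": [3, 5, 7],
    "learning_rate": [0.01, 0.1],
}


def grid_combinations(grid):
    """Yield one dict per combination of hyperparameter values."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


combos = list(grid_combinations(param_grid))
for params in combos:
    print(params)
```

The number of combinations is the product of the list lengths, so the cost grows quickly as more hyperparameters or values are added.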

4. Randomized CV

Pasted image 20240904171701.png

Iterate through randomly sampled sets of hyperparameters instead of trying every combination
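A sketch of the random-sampling idea, using only the standard library. The helper `random_combinations` and the grid values are hypothetical. This version samples with replacement, so a combination may repeat, which is a simplification compared with library implementations.

```python
import random

# Hypothetical search space, same shape as a grid.
param_grid = {
    "max_depth": [3, 5, 7, 9],
    "learning_rate": [0.001, 0.01, 0.1],
}


def random_combinations(grid, n_iter, seed=0):
    """Draw n_iter random hyperparameter sets (with replacement)."""
    rng = random.Random(seed)  # fixed seed so the draws are reproducible
    return [
        {name: rng.choice(values) for name, values in grid.items()}
        for _ in range(n_iter)
    ]


samples = random_combinations(param_grid, n_iter=5)
for params in samples:
    print(params)
```

Here only 5 of the 12 possible combinations are tried, which trades completeness for speed.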

Tip

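Grid & Randomized CVs are used for choosing hyperparameters

K-Fold & Stratified K-Fold (K-Fold that keeps the class proportions the same in every fold) are used for splitting the data to train and evaluate a model

Hence, they can be combined: each hyperparameter set is scored with K-Fold, and the set with the best average score wins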
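One way the combination might look with scikit-learn, assuming it is installed. The synthetic dataset, the `LogisticRegression` model, and the values tried for `C` are illustrative choices only, not something these notes prescribe.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, KFold

# Synthetic stand-in data: 500 rows, matching the K-Fold example size.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# K-Fold decides how the rows are split; the grid decides what is tried.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
param_grid = {"C": [0.1, 1.0, 10.0]}  # illustrative values

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid,
    cv=cv,
    scoring="accuracy",
)
search.fit(X, y)  # trains 3 candidates x 5 folds, then refits the best one

best_params = search.best_params_
best_score = search.best_score_  # mean accuracy across the 5 folds
print(best_params, round(best_score, 3))
```

Swapping `KFold` for `StratifiedKFold`, or `GridSearchCV` for `RandomizedSearchCV`, follows the same pattern.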