Breakdown of algorithm
-
Divide and Conquer
-
Input: a training set of instances with class labels, and a homogeneity measure.
-
Recursively split the set into subsets with greater homogeneity until all subsets are homogeneous.
-
The algorithm recursively divides the dataset based on feature tests, producing a tree structure that represents the underlying patterns in the data.
-
|D|: Represents the number of instances in the dataset D.
-
∀x ∈ D: For all instances x in the dataset D.
-
c: Represents a class label.
-
Return leaf with default class: If the dataset is empty, the algorithm returns a leaf node with a default class.
-
Return leaf with class label c, containing D: If the dataset is homogeneous (all instances have the same class c), the algorithm returns a leaf node with the class label c and includes the dataset in that leaf.
-
Select a test based on a single input variable: This step involves selecting a feature and a corresponding condition to split the dataset into subsets.
-
Split D into D_1, ..., D_k: The dataset is split into subsets based on the selected test, where k is the number of outcomes of the test.
-
for i = 1 to k do INDUCETREE(D_i): Recursively apply the INDUCETREE function to each subset D_i.
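
The steps above can be sketched in Python as follows. This is a minimal illustration under several assumptions, not a definitive implementation: the notes do not name a homogeneity measure, so Gini impurity is used here as one common choice; tests are assumed to be "which value does this categorical feature take"; and the helper names (`gini`, `select_test`, `induce_tree`), the dictionary-based tree, and the toy dataset are all hypothetical. The sketch also adds one stopping rule that the steps do not mention: if no feature can split the data, it returns a majority-class leaf.

```python
from collections import Counter

def gini(labels):
    """Gini impurity (assumed measure): 0.0 means perfectly homogeneous."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def select_test(D):
    """Pick the single feature whose split has the lowest weighted impurity.
    Returns None when no feature separates the instances."""
    best_feature, best_score = None, None
    for feature in D[0][0]:
        groups = {}
        for x, label in D:
            groups.setdefault(x[feature], []).append(label)
        if len(groups) < 2:
            continue  # only one outcome: this feature cannot split D
        score = sum(len(g) / len(D) * gini(g) for g in groups.values())
        if best_score is None or score < best_score:
            best_feature, best_score = feature, score
    return best_feature

def induce_tree(D, default_class=None):
    """D is a list of (features_dict, class_label) pairs."""
    # Empty dataset: return a leaf with the default class.
    if not D:
        return {"leaf": default_class, "instances": []}
    labels = [label for _, label in D]
    # Homogeneous dataset: return a leaf with class label c, containing D.
    if len(set(labels)) == 1:
        return {"leaf": labels[0], "instances": D}
    majority = Counter(labels).most_common(1)[0][0]
    # Select a test based on a single input variable.
    feature = select_test(D)
    if feature is None:
        # Added stopping rule (not in the steps above): nothing left to split on.
        return {"leaf": majority, "instances": D}
    # Split D into D_1, ..., D_k, one subset per outcome of the test.
    subsets = {}
    for x, label in D:
        subsets.setdefault(x[feature], []).append((x, label))
    # Recursively apply induce_tree to each subset D_i.
    children = {value: induce_tree(D_i, majority) for value, D_i in subsets.items()}
    return {"test": feature, "children": children}

# Toy dataset (made up for illustration).
data = [
    ({"outlook": "sunny", "windy": False}, "no"),
    ({"outlook": "sunny", "windy": True}, "no"),
    ({"outlook": "rainy", "windy": False}, "yes"),
    ({"outlook": "rainy", "windy": True}, "no"),
    ({"outlook": "overcast", "windy": False}, "yes"),
]
tree = induce_tree(data)
```

Because this sketch only creates subsets for feature values that actually occur in D, the empty-dataset case is reached only when `induce_tree` is called directly on an empty list. A variant that branches on every possible value of a feature would hit that case during recursion and use the default class there.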