In decision tree learning, which criterion is primarily used to choose the best split at a node so that node impurity is reduced?
Variance reduction of the target variable
Number of samples in each split
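
For reference, here is a minimal sketch of how a variance-based split score can be computed for a regression tree. The function names `variance_reduction` and `best_threshold` are hypothetical and chosen only for illustration. The sketch assumes a single numeric feature and uses population variance as the impurity measure.

```python
from statistics import pvariance


def variance_reduction(parent, left, right):
    """Drop in target variance obtained by splitting `parent` into `left` and `right`.

    Hypothetical helper: impurity is the population variance of the targets,
    and each child is weighted by its share of the parent's samples.
    """
    n = len(parent)
    weighted_child_variance = (
        (len(left) / n) * pvariance(left) + (len(right) / n) * pvariance(right)
    )
    return pvariance(parent) - weighted_child_variance


def best_threshold(xs, ys):
    """Try midpoints between consecutive distinct feature values.

    Returns (threshold, reduction) for the split with the largest variance
    reduction, or (None, 0.0) if no split improves on the parent node.
    """
    pairs = sorted(zip(xs, ys))
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical feature values cannot be separated
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        gain = variance_reduction(ys, left, right)
        if gain > best[1]:
            best = (threshold, gain)
    return best


if __name__ == "__main__":
    # Toy data (made up for illustration): the targets jump between x=2 and x=3.
    print(best_threshold([1, 2, 3, 4], [1.0, 1.0, 5.0, 5.0]))
```

Weighting each child by its sample share keeps the score comparable across splits of different sizes. Classification trees typically apply the same pattern with Gini impurity or entropy in place of variance.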
