In the construction of decision trees it often occurs that a mixed set of positive and negative examples remains at a leaf node. Suppose that we have p positive (class 1) and n negative (class 0) training examples:
- Show that an algorithm which picks the majority classification minimizes the absolute error over the set of examples at the leaf.
- Show that returning the class probability p/(p+n) minimizes the sum of squared errors.
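Before proving the two claims, it can help to verify them numerically. The following sketch (a sanity check, not a proof) uses arbitrary illustrative counts p = 7 and n = 3 and compares constant predictions at the leaf:

```python
# Illustrative example counts (arbitrary, not from the exercise text):
p, n = 7, 3
labels = [1] * p + [0] * n

def abs_error(c):
    """Sum of absolute errors when the leaf always predicts c."""
    return sum(abs(y - c) for y in labels)

def sq_error(c):
    """Sum of squared errors when the leaf always predicts c."""
    return sum((y - c) ** 2 for y in labels)

# Claim 1: the majority class (here 1, since p > n) minimizes the
# absolute error among the constant classifications 0 and 1.
assert abs_error(1) < abs_error(0)   # n = 3 vs. p = 7

# Claim 2: the class probability p/(p+n) minimizes the squared error;
# search a grid of candidate constants in [0, 1].
candidates = [i / 100 for i in range(101)]
best = min(candidates, key=sq_error)
assert best == p / (p + n)           # 0.7
```

Predicting a constant c makes each positive example contribute |1 - c| to the absolute error and each negative example |0 - c|, so the check confirms the intuition that the majority label wins under absolute error, while squared error is minimized at the mean of the labels, which is exactly p/(p+n).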