
Decision Trees [2 P]

In the construction of decision trees it often happens that a mixed set of positive and negative examples remains at a leaf node. Suppose that we have $ p$ positive (class 1) and $ n$ negative (class 0) training examples at such a leaf:
1. Show that an algorithm which picks the majority classification minimizes the sum of absolute errors over the set of examples at the leaf.
2. Show that returning the class probability $ p / (p+n)$ minimizes the sum of squared errors.
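As a sanity check on both claims, the following sketch evaluates each error sum over a grid of constant predictions $ c \in [0,1]$ and locates the minimizer numerically. The counts $ p = 7$, $ n = 3$ are an arbitrary example choice, not part of the exercise.

```python
import numpy as np

# Example counts at the leaf (any positive integers would do).
p, n = 7, 3

# Candidate constant predictions c in [0, 1].
c = np.linspace(0.0, 1.0, 1001)

# Sum of absolute errors: each positive example contributes |1 - c|,
# each negative example contributes |0 - c|.
abs_err = p * np.abs(1.0 - c) + n * np.abs(c)

# Sum of squared errors: (1 - c)^2 per positive, c^2 per negative.
sq_err = p * (1.0 - c) ** 2 + n * c ** 2

best_abs = c[np.argmin(abs_err)]  # expected: majority class (1, since p > n)
best_sq = c[np.argmin(sq_err)]    # expected: class probability p / (p + n)

print(best_abs)
print(best_sq, p / (p + n))
```

Because the absolute-error sum is linear in $ c$, its minimum over $[0,1]$ sits at an endpoint, i.e. at the majority class; the squared-error sum is a parabola whose vertex is exactly $ p/(p+n)$, which is what the grid search recovers.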

Pfeiffer Michael 2006-01-18