where $\beta_m$, $m = 1, \dots, M$, are the expansion coefficients, i.e. the weights of the contributions of the weak classifiers in AdaBoost. The $b(x; \gamma_m)$ are the basis functions; in AdaBoost these are the individual classifiers $G_m(x)$, where $\gamma_m$ is the parametrization of the classifiers (e.g. a string describing the split variables, split points and predictions of a decision tree). Additive models are fit by minimizing a loss function averaged over the training data:

$$\min_{\{\beta_m, \gamma_m\}_{m=1}^{M}} \; \frac{1}{N} \sum_{i=1}^{N} L\Big(y_i, \sum_{m=1}^{M} \beta_m \, b(x_i; \gamma_m)\Big)$$

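In forward stagewise modeling we start with $f_0(x) = 0$ and add new basis functions sequentially, without adjusting the parameters and coefficients of those that have already been added. So at iteration $m$ we find the new expansion coefficient $\beta_m$ and the parameters $\gamma_m$ of the classifier by

$$(\beta_m, \gamma_m) = \arg\min_{\beta, \gamma} \; \frac{1}{N} \sum_{i=1}^{N} L\big(y_i, f_{m-1}(x_i) + \beta \, b(x_i; \gamma)\big), \qquad f_m(x) = f_{m-1}(x) + \beta_m \, b(x; \gamma_m).$$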
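
The stagewise procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the exercise. The basis functions are one-dimensional decision stumps, the parameter $\gamma$ is a hypothetical `(threshold, sign)` pair, and the coefficient is found by brute-force search over a hypothetical grid `beta_grid` rather than in closed form.

```python
import math

def stump(x, threshold, sign):
    """Basis function b(x; gamma) with gamma = (threshold, sign)."""
    return sign if x > threshold else -sign

def exp_loss(y, f):
    """Exponential loss for labels y in {-1, +1}."""
    return math.exp(-y * f)

def forward_stagewise(xs, ys, loss, n_rounds, beta_grid):
    """Add one term beta * b(x; gamma) per round; earlier terms stay fixed."""
    points = sorted(set(xs))
    thresholds = [(a + b) / 2 for a, b in zip(points, points[1:])]
    gammas = [(t, s) for t in thresholds for s in (-1, 1)]
    f = [0.0] * len(xs)  # f_0(x) = 0 on the training points
    model = []
    for _ in range(n_rounds):
        best = None
        for gamma in gammas:
            preds = [stump(x, *gamma) for x in xs]
            for beta in beta_grid:
                # Average loss of the candidate f_{m-1} + beta * b(.; gamma)
                avg = sum(loss(y, fi + beta * p)
                          for y, fi, p in zip(ys, f, preds)) / len(xs)
                if best is None or avg < best[0]:
                    best = (avg, beta, gamma, preds)
        _, beta, gamma, preds = best
        model.append((beta, gamma))
        f = [fi + beta * p for fi, p in zip(f, preds)]
    return model, f

# Toy data (made up for illustration): no single stump separates the classes.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
beta_grid = [0.05 * k for k in range(1, 41)]  # hypothetical search grid
model, f = forward_stagewise(xs, ys, exp_loss, n_rounds=3, beta_grid=beta_grid)
final_loss = sum(exp_loss(y, fi) for y, fi in zip(ys, f)) / len(xs)
```

Each round only searches over the new pair `(beta, gamma)`. The values already stored in `model` are never revisited, which is what distinguishes the stagewise fit from a joint minimization over all coefficients.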

Show that AdaBoost.M1^{2} is equivalent to forward stagewise additive modeling using the exponential loss function $L(y, f(x)) = \exp(-y \, f(x))$. Prove all details.

*Hints:*

- Show that you can write the average exponential loss as a weighted sum of $\exp(-\beta \, y_i \, G(x_i))$ with some weights $w_i^{(m)}$.
- Show that the new weak classifier $G_m$ must minimize the weighted error rate in predicting $y$.
- Define $\mathrm{err}_m$ and express $\beta_m$ as a function of $\mathrm{err}_m$.
- Define the relationship between the variables $\beta_m$ from the proofs and $\alpha_m$ in the definition of AdaBoost.M1.
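
Before writing the proof, it can help to check the hints numerically. The sketch below is a sanity check, not a proof. It fixes made-up weights and predictions, minimizes the weighted exponential loss over $\beta$ by grid search, and compares the result with the closed form $\tfrac{1}{2}\log\frac{1 - \mathrm{err}}{\mathrm{err}}$. It assumes the usual AdaBoost.M1 definition in which the classifier weight is $\alpha = \log\frac{1 - \mathrm{err}}{\mathrm{err}}$; all variable names are hypothetical.

```python
import math

def weighted_exp_loss(beta, weights, ys, preds):
    """Weighted sum of exp(-beta * y_i * G(x_i)) for fixed weights w_i."""
    return sum(w * math.exp(-beta * y * g)
               for w, y, g in zip(weights, ys, preds))

# Made-up example: labels, predictions of one weak classifier, and weights.
ys = [1, 1, -1, -1, 1, 1]
preds = [1, 1, -1, -1, -1, -1]
weights = [0.1, 0.2, 0.1, 0.2, 0.3, 0.1]

# Weighted error rate of the weak classifier.
err = sum(w for w, y, g in zip(weights, ys, preds) if y != g) / sum(weights)

# Candidate closed form for the optimal coefficient.
beta_closed = 0.5 * math.log((1 - err) / err)

# Brute-force minimizer on a grid with step 0.001.
grid = [k / 1000 for k in range(1, 3000)]
beta_grid = min(grid, key=lambda b: weighted_exp_loss(b, weights, ys, preds))

# Classifier weight under the assumed AdaBoost.M1 definition.
alpha = math.log((1 - err) / err)
```

With these numbers the grid minimizer agrees with `beta_closed` up to the grid resolution, and `alpha` equals `2 * beta_closed`. The proof still has to establish both relations for arbitrary weights.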