Computational Intelligence, SS08
2 VO 442.070 + 1 RU 708.070

Homework 48: Learning Algorithms in Weka



[Points: 12.5; Issued: 2007/05/08; Deadline: 2007/06/12; Tutor: Ilir Ademi; Info hour: 2007/06/08, 15:15-16:15, HSi11; Review of graded homework (Einsichtnahme): 2007/06/22, 15:15-16:15, HSi11; Download: pdf; ps.gz]





Comparing different learning algorithms [7 points]

This homework assignment asks you to compare the performance of four learning algorithms on different data sets using the WEKA toolkit1. Choose two data sets from the archive contained in the file datasets.zip and compare the following four algorithms:





Category                      WEKA implementation
--------------------------------------------------
Majority/Average predictor    ZeroR
Decision trees                j48.J48
Instance-based learning       IBk
Support vector machines       SMO




Use the pruned version of the decision trees (set unpruned to false) and at least three different kernels for the support vector machines. A sketch of one possible evaluation setup is given after the list below.
  • Choose an appropriate method to compare the algorithms.
  • Explain what you did so that the results can be reproduced by everyone.
  • Present your results in a clear, structured, and legible way.
  • State for each data set which learning algorithm you would recommend and explain why. Note: Consider not only the error on the test set, but also criteria such as the time needed for learning, the interpretability of the hypothesis, etc.
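
For orientation, here is a minimal Java sketch of how such a comparison could be scripted with the WEKA API instead of the Explorer GUI. It is only an illustrative outline, not the required solution: the file name my-dataset.arff is a placeholder for one of the data sets from datasets.zip, the package paths (e.g. weka.classifiers.trees.J48 rather than the older j48.J48) and the SMO kernel classes depend on the WEKA version, and k = 3 for IBk is an arbitrary example value.

    import java.util.Random;

    import weka.classifiers.Classifier;
    import weka.classifiers.Evaluation;
    import weka.classifiers.functions.SMO;
    import weka.classifiers.functions.supportVector.PolyKernel;
    import weka.classifiers.functions.supportVector.RBFKernel;
    import weka.classifiers.lazy.IBk;
    import weka.classifiers.rules.ZeroR;
    import weka.classifiers.trees.J48;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class CompareLearners {
        public static void main(String[] args) throws Exception {
            // Placeholder file name -- replace with one of the data sets from datasets.zip.
            Instances data = DataSource.read("my-dataset.arff");
            data.setClassIndex(data.numAttributes() - 1);

            // Pruned decision tree, as required by the assignment.
            J48 tree = new J48();
            tree.setUnpruned(false);

            // Two of the (at least three) kernels to be tried for SMO.
            SMO smoPoly = new SMO();
            smoPoly.setKernel(new PolyKernel());
            SMO smoRbf = new SMO();
            smoRbf.setKernel(new RBFKernel());

            // k = 3 for IBk is an arbitrary example value.
            Classifier[] learners = { new ZeroR(), tree, new IBk(3), smoPoly, smoRbf };

            for (Classifier c : learners) {
                // 10-fold cross-validation with a fixed seed for reproducibility.
                Evaluation eval = new Evaluation(data);
                eval.crossValidateModel(c, data, 10, new Random(1));
                System.out.printf("%-10s error: %.2f%%%n",
                        c.getClass().getSimpleName(), eval.pctIncorrect());
            }
        }
    }

The 10-fold cross-validation used here is only one possible comparison method; whatever method you choose, report the random seed and all non-default parameters so that your results are reproducible.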

Decision Trees [5.5 points]

Use the breast-cancer.arff dataset from the datasets.zip file. Apply the J48 algorithm for decision trees with various settings to this dataset. Use 10-fold cross-validation as the evaluation method.
  • Apply the J48 algorithm with different values (1 to 9; you can use a step size of 2) of the parameter minNumObj, which specifies the minimum number of examples contained in a leaf, and use the unpruned version of the decision trees (set the parameter unpruned to true). A sketch of how this sweep could be automated is given after this list.
  • Create a plot which shows the error on the training set and the cross-validation error for the different values of minNumObj. Also create a plot which shows the size of the tree as a function of minNumObj. Interpret your results.
  • How do the results change when using pruned decision trees (set unpruned to false)? Interpret the result and compare the sizes of the pruned and unpruned trees.
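
The sweep over minNumObj can of course be done by hand in the Explorer, but for completeness here is a rough Java sketch of how it could be automated. Treat it as an outline under assumptions: breast-cancer.arff is assumed to be in the working directory, and measureTreeSize() refers to the additional measure exposed by WEKA's J48 implementation, whose exact availability may depend on the WEKA version.

    import java.util.Random;

    import weka.classifiers.Evaluation;
    import weka.classifiers.trees.J48;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class J48MinNumObjSweep {
        public static void main(String[] args) throws Exception {
            // Assumes breast-cancer.arff (from datasets.zip) is in the working directory.
            Instances data = DataSource.read("breast-cancer.arff");
            data.setClassIndex(data.numAttributes() - 1);

            for (int m = 1; m <= 9; m += 2) {
                J48 tree = new J48();
                tree.setUnpruned(true);   // unpruned trees, as required
                tree.setMinNumObj(m);     // minimum number of examples per leaf

                // 10-fold cross-validation error (the tree is rebuilt inside each fold).
                Evaluation cvEval = new Evaluation(data);
                cvEval.crossValidateModel(tree, data, 10, new Random(1));

                // Training error and tree size: build once on the full training set.
                tree.buildClassifier(data);
                Evaluation trainEval = new Evaluation(data);
                trainEval.evaluateModel(tree, data);

                System.out.printf("minNumObj=%d  train error=%.2f%%  CV error=%.2f%%  tree size=%.0f%n",
                        m, trainEval.pctIncorrect(), cvEval.pctIncorrect(), tree.measureTreeSize());
            }
        }
    }

The printed values can then be used to create the requested plots of training error, cross-validation error, and tree size versus minNumObj.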




Footnotes

1. See the links section on the CI homepage for further information and tutorials about WEKA.