Maschinelles Lernen B, WS 2004
Institut für Grundlagen der Informationsverarbeitung (708) last changes: 01.02.05

Course Contents:

Machine learning methods are presented which allow artificial systems to learn successful action policies. The artificial agent could be a robot, a character in a computer game or an Internet browser. In general, no teacher is available to tell the agent which action would be optimal in a given situation. Instead, the agent only receives occasional "rewards" or "punishments", and has to find out on its own how much each action in a sequence contributed to a reward. From this information the agent has to develop efficient strategies for future tasks. Applications will be shown that mainly demonstrate learning algorithms for humanoid robots (on which the institute is currently working in a joint research project with EPFL Lausanne) and for quasi-commercial computer games.

Reinforcement Learning algorithms have been particularly successful at solving problems of this kind. Therefore we will concentrate on this learning approach during the lecture and discuss both the theoretical background (dynamic programming, Markov decision processes) and applications.
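To make the idea of learning from occasional rewards concrete, here is a minimal sketch of tabular Q-learning, one common reinforcement learning algorithm. It is not taken from the course material or from the Reinforcement Learning Toolbox: the corridor environment, the function names and all parameter values are assumptions chosen for illustration. The agent starts in cell 0 and is rewarded only when it reaches the last cell, so it has to work out by itself which earlier steps contributed to that reward.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal (terminal)
ACTIONS = (-1, +1)    # index 0: step left, index 1: step right


def step(state, action):
    """Hypothetical toy environment: reward 1 only on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))        # explore
            else:
                best = max(q[state])                   # exploit, random tie-break
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best action index for every non-terminal cell."""
    return [max(range(len(ACTIONS)), key=lambda a: q[s][a])
            for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(q_learning()))
```

The discount factor gamma is what spreads the delayed reward backwards: cells closer to the goal end up with higher action values, and the greedy policy follows that gradient.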

In this lecture we will also cover genetic algorithms (often called evolutionary algorithms), which are another interesting approach to machine learning of successful policies. Here the computer simulates evolution by randomly mutating and crossing over different promising strategies. The "fittest" of the newly generated policies are selected, and evolution proceeds on this new population.
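The mutate, cross over and select loop described above can be sketched as follows. This is an illustrative toy, not the GA toolbox used in the lab sessions: the bit-string encoding, the fitness function (counting 1-bits) and all parameter values are assumptions, and truncation selection is only one of several possible selection schemes.

```python
import random


def fitness(bits):
    """Hypothetical fitness: number of 1-bits in the candidate."""
    return sum(bits)


def crossover(a, b, rng):
    """One-point crossover of two parent bit strings."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]


def evolve(n_bits=20, pop_size=30, generations=60, rate=0.02, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []                                   # best fitness per generation
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        parents = pop[: pop_size // 2]             # keep the fitter half
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng),
                   rate, rng)
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children                   # next generation
    return max(pop, key=fitness), history


if __name__ == "__main__":
    best, history = evolve()
    print(fitness(best), history[0], history[-1])
```

Because the fitter half survives unchanged into the next generation, the best fitness in this sketch can never decrease from one generation to the next.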

Aim of the Course:

The goal is to give an overview of machine learning methods for artificial agents, and to apply such methods to the solution of exercise problems.

Teaching Methods:

Students are expected to learn how to solve autonomous learning problems, and which algorithms are suitable for what kind of application.

Course Prerequisites:

Computational Intelligence, Statistical Methods


This page lists all updates of this course homepage. It will be kept up-to-date during the semester.

01.02.05 The results of the test and the final grades are now online. Happy holidays!
26.01.05 The results for the 3rd problem set are now online. The last presentation hour will take place on Friday, January 28. Topics for projects and diploma theses will also be presented there, in case you are interested in continuing with machine learning.
26.01.05 The fastest times for the sail challenge can be found here.
14.01.05 Please register via TUGOnline for the test (1st February) by 28th January.
14.01.05 The examples for the GA toolbox that were presented in the lab session are now available for download here.
10.01.05 Several new links to material on genetic algorithms have been added to the Literature and Links sections.
10.01.05 The next lab session will take place on Friday, January 14th.
23.12.04 The simulation environment for the sail-challenge task (task 3) is now available here:
14.12.04 The third and last problem set is now available. The environment files for task 3 will be made available soon.
25.11.04 An alternative program for exercise 2 is available here. You can also use this program instead of the Multipole-Tutorial that comes with the RL-Toolbox. Please indicate on your solution which program you used to solve this task.
12.11.04 The next lab session will be on Friday November 26th.
12.11.04 The new version of the Reinforcement Learning Toolbox is now available, together with the demos presented during the tutorial.
29.10.04 The next lab session will be on Friday November 5th. This will be a tutorial for the Reinforcement Learning Toolbox that you will use for Problem Set 2. The first presentation hour will take place on November 12th, 13:15 - 14:45.
27.10.04 A compilation of all the theorems and corollaries that were covered in lecture 3 is now available here and in the section Course Material.
27.10.04 A reminder: The next lab session will take place Friday, 29th October at 13:15 in HS i11. Use this hour to ask any questions regarding problem set 1.
19.10.04 Since there seems to be a problem with the website of the Sutton and Barto book, I have added a link to a full PDF version of the book in the Course Material section. The direct link is
18.10.04 The first problem set is now available here. Solutions have to be submitted before November 2nd 2004.
08.10.04 There will be no practicals on 15th and 22nd of October. The next practicals will take place on October 29th.
05.10.04 IMPORTANT: From next week on (12.10.04) the lecture will take place in Seminarraum IGI, Inffeldgasse 16b, 1st floor. The practicals will still be held in lecture hall i11.
04.10.04 This homepage has been created. I hope you will make use of the services that we offer you here. If you have any suggestions or complaints concerning this homepage, please send us an e-mail.


On this site you will find the problem sets for the practicals.

Problem Sets

Nr. | Issued | Submission | Link | Additional Material
1 | 18.10.04 | 02.11.04 | Problem Set 1 |
2 | 02.11.04 | 14.12.04 | Problem Set 2 | Alternative Program for Exercise 2
3 | 14.12.04 | 25.01.05 | Problem Set 3 | Simulation environment for Exercise 3, GA Toolbox, Sail Challenge Hall of Fame

Please post your questions concerning the problem sets to the MLB Newsgroup, or send them directly to Michael Pfeiffer.

People Involved

This course is organized by the Institut für Grundlagen der Informationsverarbeitung, Inffeldgasse 16b/1. Stock, A-8010 Graz.

Lecturers / Instructors


If you have any questions or problems, please do not hesitate to contact one of the above persons.

Place and Date


Time: Tuesday, 14.15-15.45
First lecture: Tuesday, 5th October 2004
Lecture Hall: Seminarraum IGI, Inffeldgasse 16b, 1st floor

Lab Sessions

Time: Friday, 13.15-14.00
First practical: Friday, 8th October 2004
Lecture Hall: i11

Timetable for Lab Sessions

1 | 08.10.04 | Organization, Exploration/Exploitation Dilemma
2 | 29.10.04 | Problem Set 1, Value Functions
3 | 05.11.04 | Tutorial for Reinforcement Learning Toolbox
4 | 12.11.04 | First Presentation Hour (13:15 - 14:45)
5 | 26.11.04 | Problem Set 2
6 | 10.12.04 | Problem Set 2
7 | 17.12.04 | Second Presentation Hour (13:15 - 14:45)
8 | 14.01.05 | Genetic Algorithms
9 | 28.01.05 | Third Presentation Hour

Further Reading:

Genetic Algorithms:

Course Material

Software | Lecture Notes | Literature | Tutorials


Lecture Notes

Lecture | Date | Topic | Course Material
1 | 05.10.04 | Introduction to Reinforcement Learning | Slides from Sutton / Barto
2 | 12.10.04 | The Reinforcement Learning Problem | Slides from Sutton / Barto
3 | 19.10.04 | Theory of Reinforcement Learning 1/2 | Theorems (PDF)
4 | 02.11.04 | Theory of Reinforcement Learning 2/2 |
5 | 09.11.04 | Temporal Difference Learning, Eligibility Traces | Slides from Sutton / Barto
6 | 16.11.04 | Adaptive Control, Humanoid Robots |
7 | 23.11.04 | Function Approximation in RL | Slides from Sutton / Barto
8 | 30.11.04 | Hierarchical Reinforcement Learning | Papers
9 | 07.12.04 | Policy Gradient RL | Papers
10 | 14.12.04 | RL for Motor Control | Slides
11 | 11.01.05 | Evolutionary Algorithms |
12 | 18.01.05 | Learning and Evolution, GA in nature |

Slides from Practicals

1 | 08.10.04 | Organization, Exploration/Exploitation Dilemma (PDF)
2 | 29.10.04 | Problem Set 1, Value Functions (PDF)
3 | 05.11.04 | Tutorial: Reinforcement Learning Tutorial, Old Tutorial (PDF)
4 | 12.11.04 | Presentation Problem Set 1
5 | 26.11.04 | Problem Set 2, On- / off-policy Learning, Self-play Learning (PDF)
6 | 10.12.04 | Problem Set 2, Policy Gradient RL (PDF)
7 | 17.12.04 | Presentation Problem Set 2
8 | 14.01.05 | Genetic Algorithms and Toolbox

Literature and Slides

* Thanks to Prof. Andrew G. Barto for providing his lecture slides.


Sources for Scientific Literature

Reinforcement Learning

Genetic Algorithms