Machine Learning B (Maschinelles Lernen B), winter semester 2006
Institut für Grundlagen der Informationsverarbeitung (708)

Course Contents:

This course presents machine learning methods that allow artificial systems to learn successful action policies. The artificial agent could be a robot, a character in a computer game, or an Internet browser. In general, no teacher is available to tell the agent which action would be optimal in a given situation. Instead, the agent receives only occasional "rewards" or "punishments" and has to find out on its own how much each action in a sequence contributed to a reward. From this information the agent has to develop efficient strategies for future tasks. The applications shown mainly demonstrate learning algorithms for humanoid robots (on which the institute is currently working in a joint research project with EPFL Lausanne) and for quasi-commercial computer games.

Reinforcement Learning algorithms (http://reinforcementlearning.ai-depot.com/Main.html) have been particularly successful at solving problems of this kind. The lecture therefore concentrates on this learning approach and discusses both its theoretical background (dynamic programming, Markov decision processes) and its applications.
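
As a minimal sketch of the dynamic programming background mentioned above, the following Python example runs value iteration on a small Markov decision process. The four-state chain, the reward of 1 for reaching the last state, and the discount factor of 0.9 are assumptions made purely for illustration; they are not taken from the course material.

```python
# Value iteration on a hypothetical 4-state chain MDP (illustrative only).
# States 0..3; state 3 is terminal. Actions: 0 = left, 1 = right.
# Moving into state 3 yields reward 1; every other transition yields 0.

N_STATES = 4
TERMINAL = 3
ACTIONS = (0, 1)
GAMMA = 0.9  # assumed discount factor


def step(state, action):
    """Deterministic transition model: returns (next_state, reward)."""
    if action == 1:
        next_state = min(state + 1, N_STATES - 1)
    else:
        next_state = max(state - 1, 0)
    reward = 1.0 if next_state == TERMINAL else 0.0
    return next_state, reward


def value_iteration(tol=1e-10):
    """Repeat Bellman optimality backups until the values stop changing."""
    values = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(N_STATES):
            if s == TERMINAL:
                continue  # the terminal state keeps value 0
            best = max(r + GAMMA * values[s2]
                       for s2, r in (step(s, a) for a in ACTIONS))
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


def greedy_policy(values):
    """For each non-terminal state, pick the action with the best backup."""
    policy = []
    for s in range(N_STATES - 1):
        backups = []
        for a in ACTIONS:
            s2, r = step(s, a)
            backups.append(r + GAMMA * values[s2])
        policy.append(backups.index(max(backups)))
    return policy


values = value_iteration()
policy = greedy_policy(values)
print(values)
print(policy)
```

In this toy problem the reward arrives only at the end of the action sequence, yet the computed values decrease with distance from the goal, which is how each earlier action is credited with its share of the final reward.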


Discussed Topics:

Genetic Algorithms

In this lecture we will also cover genetic algorithms (often called evolutionary algorithms, see http://www.aic.nrl.navy.mil/galist/), which are another interesting approach to machine learning of successful policies. Here the computer simulates evolution by randomly mutating and recombining (crossing over) different promising strategies. The "fittest" of the newly generated policies are selected, and evolution proceeds with this new population.
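
The mutate, cross over, and select loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "OneMax" toy problem (fitness is the number of 1 bits), the population size, the mutation rate, and the choice of keeping the fitter half of the population as unchanged parents are all hypothetical and are not taken from the course material.

```python
import random

# Minimal genetic algorithm on the "OneMax" toy problem (illustrative only):
# a strategy is a bit string, and its fitness is the number of 1 bits.

GENOME_LENGTH = 20
POPULATION_SIZE = 30
MUTATION_RATE = 0.05  # assumed per-bit flip probability


def fitness(genome):
    return sum(genome)


def mutate(genome, rng):
    """Flip each bit independently with probability MUTATION_RATE."""
    return [1 - bit if rng.random() < MUTATION_RATE else bit for bit in genome]


def crossover(parent_a, parent_b, rng):
    """One-point crossover: prefix from one parent, suffix from the other."""
    point = rng.randint(1, GENOME_LENGTH - 1)
    return parent_a[:point] + parent_b[point:]


def evolve(generations=50, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(GENOME_LENGTH)]
                  for _ in range(POPULATION_SIZE)]
    best_per_generation = []
    for _ in range(generations):
        # Selection: keep the fitter half of the population as parents.
        population.sort(key=fitness, reverse=True)
        parents = population[:POPULATION_SIZE // 2]
        best_per_generation.append(fitness(parents[0]))
        # Variation: refill the population with mutated crossover offspring.
        children = []
        while len(parents) + len(children) < POPULATION_SIZE:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = parents + children
    return max(population, key=fitness), best_per_generation


best, history = evolve()
print(fitness(best), history[0], history[-1])
```

Because the parents survive unchanged into the next generation, the best fitness recorded in `history` can never decrease; selecting only among newly generated offspring, as the text describes, would drop that guarantee.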


Discussed Topics

News

This page lists all updates of this course homepage. It will be kept up-to-date during the semester.

13.03.07 The next date for the (oral) MLB exam is Wednesday May 2nd 2007.
05.03.07 The projects have been evaluated and the final grades for the KU are available here.
19.02.07 The next date for the (oral) MLB exam is Tuesday March 13th 2007.
11.02.07 The final results for the KU are available here. The grades for the KU are still preliminary, because project points are not yet included. The grade 2* means that you have enough points for grade 1 if you submit a project report. The grades for the VO exam are final.
29.01.07 The results of the exam on January 29th 2007 are online at the results page. The grades are still preliminary, because you can gain more bonus points for problem set 3.
18.01.07 A list of material that is relevant for the MLB exam is available here.
31.12.06 The results for problem set 2 are available here.
12.12.06 The written exam for MLB will take place on Monday, January 29th 2007, from 13:15-14:45.
12.12.06 The third and last problem set is available here. Submit your solutions by January 30th 2007.
04.12.06 The submission date for problem set 2 is now Monday December 18th. The KU class on December 12th is therefore cancelled.
01.12.06 The suggested topics for practical projects are now available online here.
27.11.06 There will be an additional KU class on Tuesday December 5th, where the project topics will be presented.
08.11.06 The results for tasks 1-4 from problem set 1 are available here. The results for the bonus exercise 5* will follow shortly.
07.11.06 The second problem set is available here. Submit your solutions by December 11th.
06.11.06 You can download the Reinforcement Learning Toolbox from www.igi.tugraz.at/ril-toolbox/general/overview.html. A short tutorial is available here, and more information can be found at the Toolbox homepage.
10.10.06 The first problem set for Machine Learning B is available here. Submit your solutions by October 30th.
09.10.06 The script for the reinforcement learning theory part is available for download at the course material section.
22.09.06 Important Notice for Telematics students: Since Machine Learning A is not offered this semester, Machine Learning B can instead be attended and counts as a core course for the "Computational Intelligence" catalog.
24.08.06 This homepage has been created. I hope you will make use of the services that we offer you here. If you have any suggestions or complaints concerning this homepage, please send me an e-mail.


Tasks

On this site you will find the problem sets and projects for the practicals.

Problem Sets

Nr.  Issued      Submission  Link           Additional Material
1    10.10.2006  30.10.2006  Problem Set 1
2    07.11.2006  18.12.2006  Problem Set 2
3    12.12.2006  30.01.2007  Problem Set 3  Truck Backer-Upper Environment


Projects

Issued      Submission  Link      Additional Material
01.12.2006  25.02.2007  Projects


Please post your questions concerning the problem sets to the MLB Newsgroup, or send them directly to Michael Pfeiffer.

People Involved

This course is organized by the Institut für Grundlagen der Informationsverarbeitung, Inffeldgasse 16b/1. Stock, A-8010 Graz.

Lecturers / Instructors


Office

If you have any questions or problems, please do not hesitate to contact one of the above persons.


Place and Date

Lectures

Time: Monday, 13.15-14.45
First lecture: Monday, 2nd October 2006
Lecture Hall: i11, Inffeldgasse 16b


Lab Sessions

Time: Tuesday, 17.15-18.00
Lecture Hall: i11, Inffeldgasse 16b
Important notice: Only a few tutorial sessions will be held in this time slot; most of the work for the KU will be done at home. See the timetable below for the date of the next tutorial session. Tutorials will be announced in the lectures, on this webpage, and in the newsgroup.

Timetable for Tutorials

Lecture    Date        Topic
1          10.10.2006  Organisation, Exercise Sheet 1
2          31.10.2006  Solutions for Exercise Sheet 1
3          07.11.2006  Exercise Sheet 2, RL Toolbox Tutorial
4          14.11.2006  RL Toolbox Tutorial Part 2
5          05.12.2006  Projects Presentation
Cancelled  12.12.2006  Exercise Sheet 3, Solutions for Exercise Sheet 2
6          09.01.2007  Genetic Algorithms Toolbox


Reinforcement Learning:



Books

Papers

Dynamic Programming

Hierarchical RL

Policy Gradient RL

Applications of RL in Robotics


Imitation Learning


Genetic Algorithms:



Books

Papers



Course Material

Software | Lecture Notes | Literature | Tutorials

Software


Lecture Notes

Lecture  Date        Topic                                      Course Material
1        02.10.2006  Introduction
2        09.10.2006  Markov Decision Processes
3        16.10.2006  Optimal Value Functions
4        23.10.2006  Dynamic Programming, Monte Carlo Methods
5        30.10.2006  Temporal Difference Learning
6        06.11.2006  Eligibility Traces
7        13.11.2006  Function Approximation
8        20.11.2006  Model-based RL, Hierarchical RL
9        27.11.2006  Policy Gradient Methods
10       04.12.2006  RL in Robotics, Imitation Learning
11       11.12.2006  Biological Hypotheses about RL
12       08.01.2007  Genetic Algorithms
13       15.01.2007  Advanced Topics in Genetic Algorithms
14       22.01.2007  Neuro-Evolution of Augmenting Topologies
15       29.01.2007  Written Exam


Slides from Practicals

Lecture  Date        Topic                                            Slides
1        10.10.2006  Organisation, First Exercise Sheet
2        31.10.2006  Solutions of First Exercise Sheet
3        07.11.2006  Second Exercise Sheet, Tutorial for RL Toolbox
4        14.11.2006  Tutorial for RL Toolbox
5        05.12.2006  Presentation of Projects
6        09.01.2007  Tutorial for GA Toolbox
7        23.01.2007  Preparation for MLB Exam



Literature and Slides



* Thanks to Prof. Andrew G. Barto for providing his lecture slides.


Tutorials




Sources for Scientific Literature


Reinforcement Learning


Genetic Algorithms