Seminar Computational Intelligence A (708.111)
Institut für Grundlagen der Informationsverarbeitung (708)
Assoc. Prof. Dr. Robert Legenstein
Office hours: by appointment (via e-mail)
IGI-seminar room, Inffeldgasse 16b/I, 8010 Graz
on Tuesday, Oct 3 2017, 15:15-17:00 (TUGonline)
Content of the seminar: Learning to Learn
"To illustrate the utility of learning to learn,
it is worthwhile to compare machine learning to human learning.
Humans encounter a continual stream of learning tasks. They do
not just learn concepts or motor skills, they also learn bias,
i.e., they learn how to generalize. As a result, humans are
often able to generalize correctly from extremely few examples -
often just a single example suffices to teach us a new thing."
[Thrun, S., & Pratt, L. (Eds.). Learning to learn.]
In this seminar, we will discuss novel work on
"learning to learn". This area of machine learning deals with
the following question: How can one train algorithms such that
they acquire the ability to learn?
The seminar continues the
discussion of last year's CI Seminar B, but is designed as a
stand-alone course, i.e., students are not expected to have
attended the previous seminar. However, basic knowledge of
neural networks is expected (e.g., from the Computational
Intelligence lecture), and basic knowledge of reinforcement
learning would be beneficial.
How to prepare and hold your talk:
The guide presented in the seminar: How to prepare and hold your talk
Lake, B. M., Ullman, T. D.,
Tenenbaum, J. B., & Gershman, S. J. (2016). Building
Machines that learn and think like people.
Only parts of it should be discussed, e.g.,
parts of Sections 3 and 4. Section 4 also contains an
introduction to learning to learn.
Greff, K., Srivastava, R. K., Koutník, J., Steunebrink, B. R., & Schmidhuber, J. (2016). LSTM: A search space odyssey. IEEE Transactions on Neural Networks and Learning Systems.
The goal of this talk is to
introduce LSTMs and their variants. Skip parts of the
evaluations if necessary.
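As background for this talk, a single step of a basic LSTM cell can be sketched in a few lines. This is a simplified cell without the peephole connections that the paper also evaluates; the weight layout, initialization, and dimensions below are illustrative choices, not from the paper:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, b):
    """One step of a basic LSTM cell (no peepholes).

    W has shape (4*n_hidden, n_input + n_hidden); its four row blocks
    compute the input, forget, and output gates and the candidate
    cell state, respectively.
    """
    n = h.shape[0]
    z = W @ np.concatenate([x, h]) + b
    i = sigmoid(z[0*n:1*n])          # input gate
    f = sigmoid(z[1*n:2*n])          # forget gate
    o = sigmoid(z[2*n:3*n])          # output gate
    g = np.tanh(z[3*n:4*n])          # candidate cell state
    c_new = f * c + i * g            # gated update of the cell state
    h_new = o * np.tanh(c_new)       # new hidden state
    return h_new, c_new

# Run the cell over a short random input sequence.
rng = np.random.default_rng(0)
n_in, n_hid = 3, 5
W = rng.standard_normal((4 * n_hid, n_in + n_hid)) * 0.1
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for t in range(4):
    h, c = lstm_step(rng.standard_normal(n_in), h, c, W, b)
```

The variants studied in the paper (removing gates, coupling the input and forget gates, etc.) are small modifications of exactly this update.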
- Williams, R. J. (1992). Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine learning, 8(3-4), 229-256.
In this talk, the REINFORCE algorithm should
be introduced after a very basic introduction to
reinforcement learning.
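The core REINFORCE update from (Williams, 1992) can be sketched on a toy problem. The two-armed bandit, the step size, and the iteration count below are illustrative choices, not from the paper, and no baseline subtraction is used:

```python
import numpy as np

# Toy two-armed bandit: arm 1 always pays reward 1, arm 0 pays 0.
rng = np.random.default_rng(0)
theta = np.zeros(2)              # policy logits
alpha = 0.5                      # learning rate (illustrative)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

for _ in range(500):
    p = softmax(theta)
    a = rng.choice(2, p=p)       # sample an action from the policy
    r = 1.0 if a == 1 else 0.0   # observe the reward
    # REINFORCE update: theta += alpha * r * grad log pi(a | theta);
    # for a softmax policy, grad log pi(a) = one_hot(a) - p.
    grad_log_pi = -p
    grad_log_pi[a] += 1.0
    theta += alpha * r * grad_log_pi
```

After training, the policy strongly prefers the rewarding arm; the same update, applied per time step with discounted returns in place of r, gives the episodic algorithm of the paper.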
- Mnih, V., Badia, A. P., Mirza, M., Graves, A.,
Lillicrap, T. P., Harley, T., ... & Kavukcuoglu, K. (2016,
February). Asynchronous methods for deep reinforcement
learning. In International Conference on Machine Learning.
Introduces the Asynchronous Advantage Actor-Critic (A3C)
algorithm, which is used in some of the papers below.
- Zoph, B., & Le, Q. V. (2016). Neural architecture search
with reinforcement learning. arXiv preprint arXiv:1611.01578.
Describes how network
architectures can be learned with reinforcement learning.
Learning to Learn for Reinforcement Learning
Wang, J. X., Kurth-Nelson, Z.,
Tirumala, D., Soyer, H., Leibo, J. Z., Munos, R., ... &
Botvinick, M. (2016). Learning to reinforcement learn. PDF
Duan, Y., Schulman,
J., Chen, X., Bartlett, P. L., Sutskever, I., &
Abbeel, P. (2016). RL²: Fast Reinforcement
Learning via Slow Reinforcement Learning. PDF.
Additional topic: Trust Region Policy Optimization (TRPO),
since it is used here (but it is quite technical).
Braun, D. A., Aertsen, A., Wolpert, D. M., & Mehring,
C. (2009). Motor task variation induces structural
learning. Current Biology, 19(4), 352-357. PDF
Presents results of a behavioral experiment
which studied learning-to-learn in human motor
control. This is modeled in (Weinstein et al., 2017) below.
Weinstein, A., & Botvinick, M. M. (2017). Structure Learning in Motor Control: A Deep Reinforcement Learning Model. arXiv preprint arXiv:1706.06827. PDF
Models the results of Braun et
al. (2009) above using model-based reinforcement learning.
Learning learning rules
Andrychowicz, M., Denil, M., Gomez, S., Hoffman,
M. W., Pfau, D., Schaul, T., & de Freitas,
N. (2016). Learning to learn by gradient descent by
gradient descent. In Advances in Neural Information Processing Systems.
Uses a recurrent neural network to propose parameter updates of
another neural network.
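The idea of learning the optimizer itself can be illustrated in miniature. In the sketch below, a single learned scalar step size stands in for the paper's LSTM optimizer, and a finite-difference meta-gradient stands in for backpropagation through the unrolled optimization; the toy quadratic task, step counts, and meta-learning rate are all illustrative assumptions:

```python
# Toy task: minimize (theta - 3)^2 starting from theta = 0.
def inner_loss(theta):
    return (theta - 3.0) ** 2

def final_loss(lr, steps=20):
    """Run the (learned) update rule for a fixed number of steps
    and return the final task loss."""
    theta = 0.0
    for _ in range(steps):
        grad = 2.0 * (theta - 3.0)
        theta -= lr * grad          # the update rule being meta-learned
    return inner_loss(theta)

# Meta-optimization: tune lr so that the inner run ends with low loss.
# (The paper instead sums losses along the trajectory and uses an LSTM
# producing per-coordinate updates.)
lr = 0.01
eps, meta_lr = 1e-4, 1e-4
for _ in range(200):
    meta_grad = (final_loss(lr + eps) - final_loss(lr - eps)) / (2 * eps)
    lr -= meta_lr * meta_grad
```

The meta-learned step size ends up much larger than the initial 0.01, so the inner optimization converges far better, which is the effect the paper obtains with a much richer learned update rule.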
- Li, K., & Malik, J. (2016). Learning to optimize. arXiv preprint arXiv:1606.01885.
Describes learning of an optimization algorithm.
Learning to learn from few examples
- Li, Z., Zhou, F., Chen, F., & Li, H. (2017). Meta-SGD: Learning to Learn Quickly for Few Shot Learning. arXiv preprint arXiv:1707.09835.
Shows how a stochastic gradient
descent (SGD) learner can be meta-learned for few-shot learning.
- Santoro, A., Bartunov, S., Botvinick, M., Wierstra, D., & Lillicrap, T. (2016). One-shot learning with memory-augmented neural networks. arXiv preprint arXiv:1605.06065.
- Ravi, S., & Larochelle, H. (2016). Optimization as a model for few-shot learning.
- Vinyals, O., Bengio, S., & Kudlur, M. (2015). Order matters: Sequence to sequence for sets. arXiv preprint arXiv:1511.06391.
Preliminary for the following paper
- Vinyals, O., Blundell, C., Lillicrap, T., & Wierstra, D. (2016). Matching networks for one shot learning. In Advances in Neural Information Processing Systems (pp. 3630-3638).
Talks should be no longer than 35 minutes, and should
be clear, interesting and informative, rather than a reprint of
the material. Select which parts of the material you want to
present and which not, and then present the selected material
well (including definitions not given in the material: look them
up on the web, or if that is not successful, ask the seminar
organizers). Often diagrams or figures are useful for a talk. On
the other hand, referring in the talk to numbered references that
are listed at the end is a no-no (a talk is an online process,
not meant to be read). For the same reason, you can also quickly
repeat earlier definitions if you suspect that the
audience may not remember them.
Talks will be assigned at the first seminar meeting on October
3, 15:15-17:00. Students are requested to have a quick
glance at the papers prior to this meeting in order to
determine their preferences. Note that the number of
participants for this seminar will be limited. Preference will
be given to students who
- are writing or will write a Master's thesis at the institute,
- are performing or will perform a student project at the institute,
- have registered early.
Participation in the seminar meetings is obligatory. We also request your
courtesy and attention for the seminar speaker: no
smartphones, laptops, etc. during a talk. Furthermore, your
active attention, questions, and discussion contributions are
welcome.
After your talk (and
possibly some corrections), send a PDF of your talk to Charlotte
Rumpf (email@example.com), who will post it on
the seminar webpage.
Topic / paper title:
- Building Machines that learn and think like people
- LSTM: A search space odyssey
- REINFORCE + Reinforcement learning
- Asynchronous methods for deep reinforcement learning
- Neural architecture search with reinforcement learning
- Learning to reinforcement learn
- RL²: Fast Reinforcement Learning via Slow Reinforcement Learning
- Motor task variation induces structural learning
- The IGI-L2L software framework
- Learning to optimize
- Meta-SGD: Learning to Learn Quickly for Few Shot Learning
- One-shot learning with memory-augmented neural networks