# David Kappel

## contact

Institute for Theoretical Computer Science

Graz University of Technology

Inffeldgasse 16b/1

A-8010 Graz, Austria

room: IC01044

+43 316 873 5847

david(at)igi.tugraz.at


I am a PhD student at the Institute for Theoretical Computer Science in Graz, Austria, under the supervision of Prof. Wolfgang Maass. My research interests include:

- Development and simulation of models for neuron and synapse dynamics in spiking networks.
- Understanding the principles that allow recurrent networks to learn to represent complex spatiotemporal spike patterns.
- Understanding the role of noise for information processing and learning in recurrent networks.
- Models of cortical microcircuits and cortical function.
- Synaptic plasticity on long and short timescales, STDP.

Neuromorphic hardware tends to limit the connectivity of the deep networks that can run on it, and generic hardware and software implementations of deep learning also run more efficiently on sparse networks. Several methods exist for pruning the connections of a neural network after it has been trained without connectivity constraints. We present an algorithm, DEEP R, that enables us to train a sparsely connected neural network directly. DEEP R automatically rewires the network during supervised training so that connections are placed where they are most needed for the task, while their total number remains strictly bounded at all times. We demonstrate that DEEP R can be used to train very sparse feedforward and recurrent neural networks on standard benchmark tasks with only a minor loss in performance. DEEP R is based on a rigorous theoretical foundation that views rewiring as stochastic sampling of network configurations from a posterior.
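
The rewiring step at the heart of DEEP R can be sketched in a few lines (a simplified numpy illustration, not the exact published update rule; the gradient, learning rate, regularization, and noise scale below are placeholder assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

n_params = 1000   # total number of potential connections
K = 100           # hard limit on active connections (assumed)
eta = 0.05        # learning rate (assumed)
alpha = 0.01      # L1 regularization strength (assumed)
T = 1e-4          # temperature of the exploration noise (assumed)

# Each potential connection has a parameter theta; the connection is
# active while theta > 0 and dormant otherwise.
theta = np.full(n_params, -1.0)
active = rng.choice(n_params, size=K, replace=False)
theta[active] = rng.uniform(0.0, 0.1, size=K)

def rewiring_step(theta, grad):
    act = theta > 0
    # Gradient step with L1 shrinkage and Gaussian exploration noise,
    # applied only to currently active connections.
    noise = np.sqrt(2 * T * eta) * rng.standard_normal(theta.shape)
    theta[act] += -eta * grad[act] - eta * alpha + noise[act]
    # Connections whose parameter crossed zero become dormant; the same
    # number of dormant connections is reactivated at random so the
    # total number of active connections stays exactly K.
    n_missing = K - int(np.sum(theta > 0))
    if n_missing > 0:
        dormant = np.flatnonzero(theta <= 0)
        revive = rng.choice(dormant, size=n_missing, replace=False)
        theta[revive] = 1e-12  # just above zero: newly active, weight ~ 0
    return theta

for _ in range(50):
    fake_grad = rng.standard_normal(n_params)  # stand-in for a task gradient
    theta = rewiring_step(theta, fake_grad)

print(int(np.sum(theta > 0)))  # → 100: the bound on connections holds
```

The key invariant is that a connection whose parameter crosses zero is pruned and an equal number of dormant connections is reactivated at random, so the number of active connections never exceeds the hard limit.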

Experimental data suggest that neural circuits configure their synaptic connectivity for a given computational task. They also point to dopamine-gated stochastic spine dynamics as an important underlying mechanism, and they show that the stochastic component of synaptic plasticity is surprisingly strong. We propose a model that elucidates how task-dependent self-configuration of neural circuits can emerge through these mechanisms. The Fokker-Planck equation allows us to relate local stochastic processes at synapses to the stationary distribution of network configurations, and thereby to computational properties of the network. This framework suggests a new model for reward-gated network plasticity, in which the common policy-gradient paradigm is replaced by continuously ongoing stochastic policy search (sampling) from a posterior distribution of network configurations. This posterior integrates priors that encode, for example, previously attained knowledge and structural constraints. The model can explain the experimentally observed capability of neural circuits to configure themselves for a given task and to compensate automatically for changes in the network or task. We also show that experimental data on dopamine-modulated spine dynamics can be modeled within this theoretical framework, and that a strong stochastic component of synaptic plasticity is essential for its performance.
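
The link between local stochastic plasticity and a stationary distribution of network configurations can be illustrated with a one-dimensional Langevin sketch (a toy assumption, not the paper's network model): a parameter whose drift follows the gradient of log p*(θ), plus Gaussian noise, samples in the stationary regime from p*(θ) itself.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy target distribution: standard normal, log p*(theta) = -theta**2/2 + const.
def grad_log_p(theta):
    return -theta

dt = 0.01    # integration step (assumed)
T = 1.0      # temperature; T = 1 samples the target distribution itself
theta = 3.0  # start far from the high-probability region
samples = []
for step in range(100_000):
    dW = np.sqrt(dt) * rng.standard_normal()
    # Langevin update: drift along the gradient of log p*, plus noise
    theta += grad_log_p(theta) * dt + np.sqrt(2 * T) * dW
    if step >= 5_000:  # discard burn-in
        samples.append(theta)

samples = np.asarray(samples)
# Empirical mean and std should approach those of the target, N(0, 1)
print(round(float(samples.mean()), 2), round(float(samples.std()), 2))
```

Raising or lowering T broadens or sharpens the stationary distribution; in the model, the same role is played by the strength of the stochastic component of synaptic plasticity.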

Synaptic plasticity is implemented and controlled through over a thousand different types of molecules in the postsynaptic density and presynaptic boutons, which assume a staggering array of different states through phosphorylation and other mechanisms. One of the most prominent molecules in the postsynaptic density is CaMKII, which is described in molecular biology as a "memory molecule" that can integrate Ca-influx signals through auto-phosphorylation on a relatively long timescale of dozens of seconds. The functional impact of this memory mechanism is largely unknown. We show that experimental data on the specific role of CaMKII activation in dopamine-gated spine consolidation suggest a general functional role in speeding up reward-guided search for network configurations that maximize reward expectation. Our theoretical analysis shows that stochastic search could in principle even attain optimal network configurations by emulating one of the best-known nonlinear optimization methods, simulated annealing. But this optimization is usually impeded by the slowness of stochastic search at a given temperature. We propose that CaMKII contributes a momentum term that substantially speeds up this search. In particular, it allows the network to overcome saddle points of the fitness function. The resulting improved stochastic policy search can be understood on a more abstract level as Hamiltonian sampling, which is known to be one of the most efficient stochastic search methods.
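
The simulated-annealing analogy can be made concrete with a Metropolis search on a toy one-dimensional fitness function (the function, step size, and cooling schedule are illustrative assumptions, not the network model of the paper):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy fitness landscape with a poorer local maximum near x = 1.4
# and a global maximum near x = -1.4 (assumed for illustration).
def fitness(x):
    return -(x**4 - 4 * x**2 + x)

x = 1.5      # start in the basin of the poorer local maximum
T = 2.0      # initial temperature
for step in range(20_000):
    proposal = x + 0.2 * rng.standard_normal()
    dF = fitness(proposal) - fitness(x)
    # Metropolis rule: always accept uphill moves; accept downhill
    # moves with probability exp(dF / T), which lets the search
    # escape local maxima while T is still high.
    if dF > 0 or rng.random() < np.exp(dF / T):
        x = proposal
    T = max(0.01, T * 0.9995)  # geometric cooling schedule (assumed)

print(round(x, 2))  # after cooling, x sits at one of the fitness maxima
```

In the abstract's terms, a CaMKII-like momentum term would speed up exactly this kind of search; a Hamiltonian variant augments the state x with an auxiliary momentum variable instead of relying on temperature alone.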

A recurrent spiking neural network is proposed that implements planning as probabilistic inference for finite and infinite horizon tasks. The architecture splits this problem into two parts: the stochastic transient firing of the network embodies the dynamics of the planning task, and with appropriately injected input this dynamics is shaped to generate high-reward state trajectories. A general class of reward-modulated plasticity rules for these afferent synapses is presented. The updates optimize the likelihood of obtaining a reward through a variant of an expectation-maximization algorithm, and learning is guaranteed to converge to a local maximum. We find that the network dynamics are qualitatively similar to transient firing patterns observed during planning and foraging in the hippocampus of awake behaving rats. The model extends classical attractor models and provides a testable prediction for identifying modulating contextual information. In a reaching and obstacle-avoidance task on a real robot arm, we investigate the network's ability to represent multiple task solutions. The neural planning method, with its local update rules, provides a basis for future neuromorphic hardware implementations, with promising potential such as large-scale data processing and the early initiation of strategies to avoid dangerous situations in robot co-worker scenarios.

General results from statistical learning theory suggest understanding not only brain computations, but also brain plasticity, as probabilistic inference. A concrete model for this has, however, been missing. We propose that inherently stochastic features of synaptic plasticity and spine motility enable cortical networks of neurons to carry out probabilistic inference by sampling from a posterior distribution of network configurations. This model provides a viable alternative to existing models that propose convergence of parameters to maximum likelihood values. It explains how priors on weight distributions and connection probabilities can be merged optimally with learned experience, how cortical networks can generalize learned information so well to novel experiences, and how they can compensate continuously for unforeseen disturbances of the network. The resulting new theory of network plasticity explains from a functional perspective a number of experimental data on stochastic aspects of synaptic plasticity that previously appeared quite puzzling.

We reexamine in this article the conceptual and mathematical framework for understanding the organization of plasticity in spiking neural networks. We propose that inherent stochasticity enables synaptic plasticity to carry out probabilistic inference by sampling from a posterior distribution of synaptic parameters. This view provides a viable alternative to existing models that propose convergence of synaptic weights to maximum likelihood parameters. It explains how priors on weight distributions and connection probabilities can be merged optimally with learned experience. In simulations we show that our model for synaptic plasticity allows spiking neural networks to compensate continuously for unforeseen disturbances. Furthermore it provides a normative mathematical framework to better understand the permanent variability and rewiring observed in brain networks.

It has recently been shown that, for stationary spike input patterns, STDP installs theoretically optimal networks for Bayesian inference in ensembles of pyramidal cells with lateral inhibition. We show here that if the experimentally observed lateral excitatory connections between pyramidal cells are taken into account, theoretically optimal probabilistic models for the prediction of time-varying spike input patterns emerge through STDP. Furthermore, a rigorous theoretical framework is established that explains how the computational properties of this important motif of cortical microcircuits emerge through learning. We show that the application of an idealized form of STDP approximates, in this network motif, a generic process for adapting a computational model to data: expectation maximization. The versatility of the computations carried out by these ensembles of pyramidal cells, and the speed with which their computational properties emerge through STDP, are demonstrated through a variety of computer simulations. We show the ability of these networks to learn multiple input sequences through STDP and to reproduce the statistics of these inputs after learning.
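
The expectation-maximization view can be illustrated with a small batch EM sketch for a mixture of Bernoulli "input patterns" (the toy data and the deterministic initialization are assumptions; the paper's model performs this adaptation online, through STDP, in a spiking network):

```python
import numpy as np

rng = np.random.default_rng(3)

n, d = 100, 20
# Synthetic binary data from two hidden "input patterns"
# (assumed toy data, not the paper's spike inputs).
X = np.zeros((n, d))
X[:50, :10] = 1.0    # pattern A active on the first half of inputs
X[50:, 10:] = 1.0    # pattern B active on the second half
X = np.abs(X - (rng.random((n, d)) < 0.05))  # 5% bit-flip noise

K = 2
pi = np.full(K, 1.0 / K)
# Deterministic, well-separated initialization: one data point per
# cluster, clipped away from 0 and 1 to avoid log(0).
mu = np.clip(np.stack([X[0], X[-1]]), 0.05, 0.95)

for it in range(20):
    # E-step: posterior responsibility of each hidden cause per data point
    log_p = X @ np.log(mu.T) + (1 - X) @ np.log(1 - mu.T) + np.log(pi)
    log_p -= log_p.max(axis=1, keepdims=True)
    r = np.exp(log_p)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: re-estimate mixture weights and Bernoulli means
    pi = r.mean(axis=0)
    mu = np.clip((r.T @ X) / r.sum(axis=0)[:, None], 0.05, 0.95)

print(np.round(mu[0, :10].mean(), 2), np.round(mu[1, :10].mean(), 2))
```

The E-step corresponds to the winner-take-all competition (a posterior over which hidden cause generated the current input), and the M-step to the STDP-like weight update that moves each neuron's weights toward the input statistics of its cause.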

NEVESIM is a software package for event-driven simulation of networks of spiking neurons with a fast simulation core in C++ and a scripting user interface in the Python programming language. It supports simulation of heterogeneous networks with different types of neurons and synapses, and can be easily extended by the user with new neuron and synapse types. To enable heterogeneous networks and extensibility, NEVESIM is designed to decouple the simulation logic of communicating events (spikes) between the neurons at the network level from the implementation of the internal dynamics of individual neurons. In this paper we will present the simulation framework of NEVESIM, its concepts and features, as well as some aspects of the object-oriented design approaches and simulation strategies that were utilized to efficiently implement the concepts and functionalities of the framework. We will also give an overview of the Python user interface, its basic commands and constructs, and discuss the benefits of integrating NEVESIM with Python. One of the valuable capabilities of the simulator is to exactly and efficiently simulate networks of stochastic spiking neurons from the recently developed theoretical framework of neural sampling. This functionality was implemented as an extension on top of the basic NEVESIM framework. Altogether, the intended purpose of the NEVESIM framework is to provide a basis for further extensions that support simulation of various neural network models incorporating different neuron and synapse types that can potentially also use different simulation strategies.
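
The central design idea, routing spike events at the network level through a global event queue while each neuron updates its internal state only when an event reaches it, can be sketched as follows (a minimal toy loop; the class and function names are illustrative, not the NEVESIM API):

```python
import heapq
import math

class LIFNeuron:
    """Leaky integrate-and-fire neuron updated only when events arrive."""
    def __init__(self, threshold=1.0, tau=20.0):
        self.threshold = threshold
        self.tau = tau
        self.v = 0.0
        self.last_update = 0.0

    def receive(self, t, weight):
        # Decay the membrane potential analytically since the last event,
        # then add the synaptic jump. Returns True if a spike is emitted.
        self.v *= math.exp(-(t - self.last_update) / self.tau)
        self.last_update = t
        self.v += weight
        if self.v >= self.threshold:
            self.v = 0.0  # reset after spiking
            return True
        return False

def simulate(neurons, synapses, initial_events, t_end):
    """synapses[i] = list of (target_index, weight, delay)."""
    events = list(initial_events)          # entries: (time, target, weight)
    heapq.heapify(events)
    spikes = []
    while events:
        t, target, weight = heapq.heappop(events)
        if t > t_end:
            break
        if neurons[target].receive(t, weight):
            spikes.append((t, target))
            # Route the spike to downstream neurons as future events
            for j, w, d in synapses[target]:
                heapq.heappush(events, (t + d, j, w))
    return spikes

# Tiny feedforward chain: external input drives neuron 0, which drives 1.
neurons = [LIFNeuron(), LIFNeuron()]
synapses = {0: [(1, 1.2, 1.0)], 1: []}
spikes = simulate(neurons, synapses, [(0.0, 0, 1.5)], t_end=10.0)
print(spikes)  # → [(0.0, 0), (1.0, 1)]
```

Because the membrane decay between events is computed in closed form, no global time-stepping is needed; this is what makes event-driven simulation exact for neuron models whose inter-event dynamics have analytical solutions.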

Experimental data show that synaptic connections, synaptic efficacy, and tuning curves of neurons are subject to permanently ongoing, more or less stochastic changes, even in the absence of overt learning tasks. These data raise the question of how stable reward-based learning is possible. We show that these data can be understood from the perspective of a Bayesian learning theory, which posits that networks learn a posterior distribution of network configurations rather than a single "optimal" network configuration. In this theoretical framework, the experimentally observed ongoing changes in network configurations (including synaptic connections and synaptic weights) assume an important functional role: they enable the network to sample over time from a low-dimensional manifold of network configurations that have high probability under a posterior distribution, thereby enabling Bayesian inference of network configurations. This posterior results from a prior that enforces structural rules of cortical networks (such as sparse connectivity, specific connection probabilities between specific types of neurons, and heavy-tailed distributions of synaptic weights). The other factor of the posterior is a term that reflects the likelihood that a network configuration leads to rewards (e.g., in the context of some motor learning task). We present a new mathematical framework (employing stochastic differential equations and Fokker-Planck equations) that creates links between local stochastic learning rules and the probability of network configurations under the posterior distribution. In particular, we show that previously proposed rules for reward-gated STDP can be derived from the more general principles of reward-based Bayesian inference of network configurations.
This new model for reward-based network plasticity is not only consistent with experimental data on ongoing stochastic changes in network configurations (in fact, it requires such ongoing stochastic changes), but also offers several functional advantages over previously considered learning frameworks based on convergence of the network configuration to an optimal one:

- better generalization capability (since the prior works against overfitting of network configurations to a small set of training examples)
- automatic compensation for network disturbances or changes in the environment (e.g., reward distributions)
- a more goal-oriented exploration of new network configurations in reward-based learning.

Substantial experimental evidence (e.g. on spine motility and the fluctuation of PSD-95 proteins) suggests that synaptic connections and synaptic efficacies are continuously fluctuating, to some extent even in the absence of imposed learning (see e.g. [1]). These findings raise the question of how stable network function can be acquired and maintained in spite of these ongoing stochastic changes of network parameters. We present a novel conceptual framework for the organization of plasticity in neuronal networks in the brain that is based on stochastic variations of standard synaptic plasticity rules (e.g., Hebbian or STDP). The stochastic component of the plasticity rules continuously drives network parameters θ within a low-dimensional manifold of parameter space. This framework not only explains how stable network function can be maintained in spite of ongoing parameter fluctuations, it also exhibits interesting new functional properties that have been posited from the perspective of learning theory [2, 3]. The low-dimensional parameter manifold represents a region where a compromise between overriding structural rules (such as sparse connectivity and heavy-tailed weight distributions) and good functional circuit properties is reached. Both structural plasticity [1] and synaptic plasticity can be integrated into this theory of network plasticity. This provides a theoretically founded framework for relating experimental data on spine motility to experimentally observed network properties. Furthermore, this framework endows neuronal networks with an important experimentally observed capability: automatic compensation for network perturbations [4]. We show that our alternative view can be turned into a rigorous learning model within the framework of probability theory. The low-dimensional parameter manifold can be characterized mathematically as the high-probability region of the posterior distribution of network parameters θ.
More precisely, we propose that stochastic plasticity mechanisms enable brain networks to sample from this posterior, as opposed to the traditional view of learning as moving parameters to local optima θ* in parameter space. We demonstrate the advantages of this new theory in several computer simulations. These examples demonstrate how functional demands on network plasticity, such as the incorporation of structural rules, automatic avoidance of overfitting, and inherent compensation capabilities, can be accomplished through stochastic plasticity rules.

[1] Holtmaat A, Svoboda K. Neuron 2006; 49.
[2] MacKay DJ. Neural Comp 1992; 4(3).
[3] Pouget A, et al. Nat Neurosci 2013; 16(9).
[4] Marder E. PNAS 2011; 108(3).

General results from statistical learning theory suggest understanding not only brain computations, but also learning in biological neural systems, as probabilistic inference. A concrete model for this has, however, been missing. We propose that inherently stochastic features of synaptic plasticity and spine motility enable cortical networks of neurons to carry out probabilistic inference by sampling from a posterior distribution of network parameters. This model provides a viable alternative to existing models that propose convergence of parameters to maximum likelihood values. It explains how priors on weight distributions and connection probabilities can be merged optimally with learned experience, how cortical networks can generalize learned information to novel experiences, and how they can compensate continuously for unforeseen disturbances of the network. The resulting new theory of network plasticity explains from a functional perspective a number of experimental data on stochastic aspects of synaptic plasticity that previously appeared quite puzzling.

We present a model that explains how a Bayesian view of synaptic plasticity as probabilistic inference could be implemented through sampling by networks of spiking neurons in the brain. Such a Bayesian perspective on brain plasticity has been proposed on general theoretical grounds, but it has remained open how this theoretically attractive model could be implemented in the brain. We propose that, apart from stochasticity at the level of neuronal activity (neural sampling), plasticity should also be understood as stochastic sampling from a posterior distribution of parameters ("synaptic sampling"). This model is consistent with a number of puzzling experimental data, such as continuing spine motility in the adult cortex. In addition, it provides desirable new functional properties of brain plasticity, such as immediate compensation for perturbations and integration of new tasks. Furthermore, it explains how salient priors such as sparse synaptic connectivity and log-normal distributions of weights could be integrated in a principled manner into synaptic plasticity rules.

Several recent publications have proposed that probabilistic inference provides a suitable theoretical framework for understanding salient aspects of biological motor control. We propose a model that implements stochastic motor control by solving a probabilistic inference problem with networks of spiking neurons through neural sampling. We demonstrate the viability of our model in a simulation of a standard arm reaching task. Our model provides a missing link between theoretical models of motor control and their biological implementation.

Efficient motor skill learning is of fundamental interest both for understanding biological motor control and for applications in robotics. Recently, probabilistic inference has been suggested as a possible model for transfer learning, decision making, and motor control in humans. For robotics, we propose to endow movement primitive representations with an intrinsic probabilistic planning system that exploits the power of stochastic optimal control methods. The parametrization of the primitive is a learned graphical model. This alternative representation competes with the state of the art and complies with salient features of biological motor control, i.e., its modular organization into elementary movements, its stochastic optimality under perturbations, and its efficiency in terms of learning.

In order to operate in a permanently changing world, brains need to extract salient hidden states from their environment, but it has remained an open question how networks of neurons in the brain could autonomously acquire this functionality. We show here that a generic network motif of cortical microcircuits, populations of pyramidal cells with lateral inhibition and excitatory interconnections, automatically acquires salient hidden state information through STDP. Hence this ubiquitous network motif could equip the brain with the capability to extract multiple hidden states of the environment and to generate predictions of the future on several temporal and spatial scales.
