next up previous
Next: Comparison of Learning Algorithms Up: MLB_Exercises_2008 Previous: Exercises

Genetic Algorithms [5+2* P]

Apply genetic algorithms to maximize the growth rate of the Escherichia coli bacterium. The growth rate is enhanced by LacZ proteins, which break down sugar lactose for use as an energy and carbon source. This LacZ proteins ($ Z$ ) are controlled by a gene regulatory network of the lactose (lac) system (see Fig. 1) consisting of an activator protein $ X$ = CRP (signal $ S_x$ =cAMP, a molecule produced within the cell upon glucose starvation) and a protein $ Y$ =LacI that is induced in the presence of $ S_y$ =lactose (which is assumed here to be constant, i.e 1). In the absence of the superior energy source glucose the lac system utilizes lactose to enhance the growth rate.

The benefit $ b$ in growth rate is proportional to the rate at which LacZ ($ Z$ ) breaks down its substrate, lactose, which is approximately proportional to the number of copies of the protein

$\displaystyle b(Z)= benefit \cdot Z $

where $ benefit$ denotes a constant factor. Note that the expression of LacZ ($ Z$ ) is only beneficial in the absence of glucose, i.e. when cAMP ($ =S_x$ ) is produced. On the other hand the expression of LacZ consumes energy and reduces the growth rate approximately by a constant factor

$\displaystyle c = cost.$

Therefore only for longer intervals of $ Z$ expression enough proteins accumulate to compensate for the cost of protein expression. The fitness function to be maximized is given by

$\displaystyle f = b-c.$

Figure: lac system of the Escherichia coli bacterium.

The dynamics of the proteins $ Y$ and $ Z$ is modeled by

$\displaystyle \frac{dY}{dt}=\beta I_{Y}- \alpha Y,     \frac{dZ}{dt}=\beta I_{Z}- \alpha Z$

with constant maximum production rate $ \beta=0.02$ and degradation rate $ \alpha=0.02$ . This results in a maximum expression rate $ Y_{max}=Z_{max}=\beta/\alpha=1$ and decay time constant of $ \alpha=0.02$ . The production rates $ I_{Y/Z}$ are given by
$\displaystyle I_{Y}$ $\displaystyle =$ $\displaystyle \Theta(k_{11} X + k_{12} Y + k_{13} Z - 1)$ (1)
$\displaystyle I_{Z}$ $\displaystyle =$ $\displaystyle \Theta(k_{21} X + k_{22} Y + k_{23} Z - 1)$ (2)

where $ \Theta$ denotes the Heaviside step function and matrix $ \bf k$ denotes activation coefficients. $ X$ is in this example is activated instantaneously by input $ S_x$ , i.e. $ X=S_x$ . This model is already implemented in MATLAB and only the activation coefficients $ \bf k$ have to be modified for the following analysis.

For this task the input $ S_x$ has the shape of a pulse with amplitude 1 and a duration of either 20 ms or 200 ms occurring with probability $ p_1$ and $ p_2$ ( $ p_1 + p_2 = 1$ ), respectively. The activation coefficients $ \bf k$ should be optimized with genetic algorithms to maximize the growth rate for different settings of $ benefit, cost, p_1,$ and $ p_2$ .

Download the example code and the Genetic algorithm toolbox.1

Complete the code for the evaluation of the fitness value in the file calc_fitness.m.

Optimize $ \bf k$ consisting of only positive elements ( $ k_{ij} \in [0 ,10]$ ) for $ cost=2$ , $ benefit=4$ , $ p_1=0.9$ and $ p_2=0.1$ using a population size of 5000 and 500 generations. Describe and explain the mechanisms of the solution. Average results over several runs of the genetic algorithm to average out outliers. Hand in a plot illustrating the dynamic of the gene regulatory network (GRN).

Repeat the optimization with parameters $ cost/benefit \in [0.1,2]$ , $ p_1 \in [0,1]$ . Describe different solutions (mechanisms of the GRN) found by the genetic algorithm. Hand in a plot of the gene regulatory network dynamics for each solution.

Hand in a two dimensional plot with axes $ cost/benefit$ and $ p_1$ which shows for the results obtain in d) which solution is selected in different parameter ranges.

[2* P] Repeat point d) with $ \bf k$ consisting of positive and negative elements ( $ k_{ij} \in [-10,10]$ ). Describe one of the new solutions that didn't emerge for $ k_{ij}>0$ for each of the parameter ranges found in e).

Present your results clearly, structured and legible. Document them in such a way that anybody can reproduce them effortless.

next up previous
Next: Comparison of Learning Algorithms Up: MLB_Exercises_2008 Previous: Exercises
Haeusler Stefan 2009-01-19