Evolution

[The second part of this workshop is partly adapted from Shiffman et al. (2012)]

In this section, we will explore how evolution works, introducing concepts such as the impact of incremental evolution in generating radically different phenotypic patterns and the importance of crossover, or genetic recombination, in reaching new zones of the genotypic space, and we will present some basic notions about genetic algorithms.

After a brief introduction to genetic algorithms, we introduce biomorphs as a simple model for understanding the genotype-phenotype relation and how incremental mutations create an exponential diversity of individual phenotypes. But random diversity is only part of the picture, so we will explore how crossover can accelerate the discovery of new genetic combinations far away from the initial genetic distribution. Finally, we will work through a series of examples of fitness-guided evolution to see how selection can operate as a driving force for evolution.

1. A brief introduction to genetic algorithms

The genetic algorithm is a method for solving both constrained and unconstrained optimization problems that is based on natural selection, the process that drives biological evolution. The genetic algorithm repeatedly modifies a population of individual solutions. At each step, the genetic algorithm selects individuals at random from the current population to be parents and uses them to produce the children for the next generation. Over successive generations, the population "evolves" toward an optimal solution.

The genetic algorithm uses three main types of rules at each step to create the next generation from the current population; a minimal sketch of how they fit together follows the list:

  1. Selection rules select the individuals, called parents, that contribute to the population at the next generation.
  2. Crossover rules combine two parents to form children for the next generation.
  3. Mutation rules apply random changes to individual parents to form children.
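
To see how the three rules fit together, here is a minimal, generic sketch of a genetic algorithm loop in Python. Everything in it (the `evolve` function and its placeholder `fitness`, `crossover` and `mutate` arguments) is illustrative; the models below make each piece concrete.

In [ ]:
import random

def evolve(population, fitness, crossover, mutate, generations=100):
    """Generic genetic-algorithm loop: selection, crossover, mutation."""
    for _ in range(generations):
        scores = [fitness(ind) for ind in population]
        children = []
        for _ in range(len(population)):
            # Selection: parents are picked with probability proportional to fitness
            mum, dad = random.choices(population, weights=scores, k=2)
            # Crossover combines the two parents; mutation then applies
            # random changes to the resulting child
            children.append(mutate(crossover(mum, dad)))
        population = children
    return max(population, key=fitness)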


In order to get a better grasp of the rules of a genetic algorithm, we will explore its different elements separately using different models.

2. Biomorphs

In developing his argument that natural selection can explain the complex adaptations of organisms, Dawkins' first concern is to illustrate the difference between the potential for the development of complexity as a result of pure randomness, as opposed to that of randomness coupled with cumulative selection.

The program Dawkins developed displayed a two-dimensional shape (a "biomorph") made up of straight black lines, the length, position, and angle of which were defined by a simple set of rules and instructions (analogous to a genome). Adding new lines (or removing them) based on these rules offered a discrete set of possible new shapes (mutations), which were displayed on screen so that the user could choose between them. The chosen mutation would then be the basis for another generation of biomorph mutants to be chosen from, and so on. Thus, the user, by selection, could steer the evolution of biomorphs. This process often produced images reminiscent of real organisms, for instance beetles, bats, or trees. Dawkins speculated that the unnatural selection role played by the user in this program could be replaced by a more natural agent if, for example, colourful biomorphs could be selected by butterflies or other insects, via a touch-sensitive display set up in a garden.

The appearance of each biomorph is determined by 9 genes: 3 that influence its width, 5 that influence its height, and 1 that influences its branching depth. These 9 genes can be used to generate more than 118 billion different biomorphs. Humans, for comparison, have about 20,000 to 25,000 protein-coding genes. The gradual change in the biomorphs' appearance in each generation serves as a simple model of biological evolution. Each biomorph is nearly identical to its parent, but after many generations the appearance of the biomorph can diverge wildly from the original. Biological evolution works in a similar way, though on a much longer time scale. For example, you closely resemble your parents, your parents closely resemble their parents, and so on, but if you go back tens of thousands of generations your distant ancestors would only bear a slight resemblance to you.

Because you select which child you want to survive, biomorphs are an example of artificial selection (similar to how humans have guided dog evolution over the past 15,000–30,000 years). In nature, however, evolution is based on natural selection: organisms that are best suited to their environment are the ones most likely to survive and pass on their genes. The later sections of this workshop move from this kind of user-driven selection towards selection driven by a fitness function.


2.1. Exercise

In the example below, the genome of the biomorph is expressed in the centre picture. The surrounding pictures correspond to small incremental mutations, each altering slightly one of the original genes. If a gene is at its limit value, the corresponding picture is coloured grey and that mutation is not available.

  • Mutate the biomorph, looking for the different phenotypes shown in the picture above.
  • Find the mutations that produce the greatest variations.

In [1]:
%%HTML
<iframe src="https://www.openprocessing.org/sketch/431860/embed/" width="611" height="611"></iframe>


2.2. Exercise

Now that you know how to change the genome through mutations, let's try some genetic interchange.

Below these lines, you have a genome to start with. You have also received an image of the phenotype you want to reach. To reach that goal, you can exchange your genome with your partners: find a partner with a phenotype you are interested in and exchange half of your genotype with him/her. Analyse the results and continue exchanging your genome until your goal is achieved.


In [4]:
import numpy as np
from IPython.display import display

# Draw a random genome of 9 genes, each an integer between 0 and 8
genome = np.random.randint(0, 9, 9)

# Force the last gene to be at least 1
if genome[8] == 0:
    genome[8] = 1

display(genome)


array([8, 0, 8, 0, 3, 2, 1, 5, 1])

In [23]:
%%HTML
<iframe src="https://www.openprocessing.org/sketch/432104/embed/" width="610" height="610"></iframe>


3. Shakespeare

The “infinite monkey theorem” is stated as follows: A monkey hitting keys randomly on a typewriter will eventually type the complete works of Shakespeare (given an infinite amount of time). The problem with this theory is that the probability of said monkey actually typing Shakespeare is so low that even if that monkey started at the Big Bang, it’s unbelievably unlikely we’d even have Hamlet at this point.

Let’s consider the phrase “to be or not to be that is the question” (we’re simplifying it from the original “To be, or not to be: that is the question”). The phrase is 39 characters long, and the probability of the monkey hitting any given key is one in twenty-seven, so the probability of the monkey typing the full phrase is (1/27)^39, i.e. one over 66·10^54. Even for a computer simulation typing one million random phrases per second, that means typing for about 2·10^42 years (note that the age of the universe is estimated to be a mere 14·10^9 years).
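
These numbers are easy to check; here is a quick back-of-the-envelope calculation in Python, assuming an alphabet of 26 letters plus the space:

In [ ]:
phrase = "to be or not to be that is the question"
combinations = 27 ** len(phrase)           # 27 possible keys, 39 positions
print(len(phrase), f"{combinations:.1e}")  # 39, about 6.7e+55 possible phrases

# Typing one million random phrases per second
years = combinations / 1e6 / (365.25 * 24 * 3600)
print(f"{years:.1e} years")                # about 2e+42 years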

We will apply a genetic algorithm to solve this task. First, we will create a population of randomly generated sentences of 39 characters. Then, we will compare each random sentence with the original sentence to compute its fitness, in this case the percentage of matched characters. The result is a score for each random sentence.
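
A sketch of this first step, assuming the simplified target phrase and an alphabet of lowercase letters plus the space:

In [ ]:
import random
import string

TARGET = "to be or not to be that is the question"
CHARSET = string.ascii_lowercase + " "

def random_phrase():
    """A randomly generated candidate sentence of the right length."""
    return "".join(random.choice(CHARSET) for _ in range(len(TARGET)))

def fitness(phrase):
    """Fraction of characters matching the target at the same position."""
    return sum(a == b for a, b in zip(phrase, TARGET)) / len(TARGET)

population = [random_phrase() for _ in range(200)]
print(max(fitness(p) for p in population))   # best score in the initial population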

Now, there are different ways to apply selection. For example, we could employ what is known as the elitist method and say, “Which two members of the population scored the highest? You two will make all the children for the next generation.” This is probably one of the easier methods to program; however, it flies in the face of the principle of variation: little variety may stunt the evolutionary process, as there could be entire families of variations not evaluated. A better solution is to use a probabilistic method, which we’ll call the “wheel of fortune” (also known as the “roulette wheel”). We will assign a probability of selection for each sentence proportional to its score.
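
A minimal sketch of the wheel of fortune, assuming every sentence already has a non-negative score:

In [ ]:
import random

def roulette_select(population, scores):
    """Pick one parent with probability proportional to its score."""
    spin = random.uniform(0, sum(scores))
    running = 0.0
    for individual, score in zip(population, scores):
        running += score
        if running >= spin:
            return individual
    return population[-1]   # guards against floating-point round-off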

Now that we have a strategy for picking parents, we need to figure out how to use reproduction to make the population’s next generation, keeping in mind the Darwinian principle of heredity—that children inherit properties from their parents. The standard approach with genetic algorithms is to pick two parents and create a child by applying crossover, which combines the genetic code of the two parents.

Let’s assume we’ve picked two phrases: FORK and PLAY. We will pick a cutting point and take the first part from one parent and the second part from the other. For instance, a cutting point between the first and second characters produces the children FLAY and PORK, one between the second and third characters produces FOAY and PLRK, and so on.
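
The FORK/PLAY example as code; a sketch of single-point crossover with a random cutting point:

In [ ]:
import random

def crossover(parent_a, parent_b):
    """Swap the tails of two parents at a random cutting point."""
    cut = random.randint(1, len(parent_a) - 1)
    return parent_a[:cut] + parent_b[cut:], parent_b[:cut] + parent_a[cut:]

print(crossover("FORK", "PLAY"))   # e.g. ('FLAY', 'PORK') if the cut falls after the first character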

Finally, we will randomly mutate the children to introduce additional variety throughout the evolutionary process itself.
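
And a sketch of mutation, where each character of a child has a small probability of being replaced by a random one:

In [ ]:
import random
import string

CHARSET = string.ascii_lowercase + " "

def mutate(phrase, mutation_rate=0.01):
    """Replace each character by a random one with probability mutation_rate."""
    return "".join(random.choice(CHARSET) if random.random() < mutation_rate else c
                   for c in phrase)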

3.1. Exercise

The result of the previous explanation is this simulation. To analyse the influence of the parameters of the genetic algorithm, open the code and change the population (popmax) and mutation rate (mutationRate) variables at lines 48 and 49.

  • With a fixed mutation rate of 1 %, try populations from 250 to 50,000 individuals.
  • With a fixed population of 1000 individuals, try mutation rates from 0 % to 10 %.
  • For each simulation, record the number of generations and the total time until the phrase is solved. What are your conclusions?
  • It is also possible to change the fitness function. On the "DNA" tab, modify line 47 to use a different equation, and calculate the probability of a certain sentence being selected according to the original and the new formulas. What are the implications of changing this formula? Note: for instance, you can implement a new fitness = log( 1e-10 + (float)score / (float)target.length() );

In [24]:
%%HTML
<iframe src="https://www.openprocessing.org/sketch/432793/embed/" width="640" height="360"></iframe>


4. Smart Rockets

In 2009, Jer Thorp released a genetic algorithms example on his blog entitled “Smart Rockets”. Jer points out that NASA uses evolutionary computing techniques to solve all sorts of problems, from satellite antenna design to rocket firing patterns. This inspired him to create a Flash demonstration of evolving rockets. Here is a description of the scenario.

A population of rockets launches from the bottom of the screen with the goal of hitting a target at the top of the screen (with obstacles blocking a straight line path). Here, each rocket will be equipped with a thruster able to fire in any direction with any strength for every frame of animation. Our strategy will be to pick some reasonable numbers and build out the system.

We stated the goal of a rocket reaching a target. In other words, the closer a rocket gets to the target, the higher the fitness. Fitness is inversely proportional to distance: the smaller the distance, the greater the fitness; the greater the distance, the smaller the fitness. We will initialize a DNA sequence as an array of random vectors quantifying the strength and direction of the rocket's thrust at each moment of the simulation. This will be the genotype, and the phenotype will be the display of the rocket on the board. Therefore, the fitness function will be calculated on the phenotype, since that is where the information needed to calculate the distance to the target resides.
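
A sketch of this genotype and fitness function; the constants and names are illustrative, not those of the sketch embedded below:

In [ ]:
import numpy as np

N_FRAMES = 200     # lifetime of a rocket, in frames
MAX_FORCE = 0.1    # maximum thruster strength

def random_dna(n_frames=N_FRAMES):
    """Genotype: one random 2D thrust vector (direction and strength) per frame."""
    angles = np.random.uniform(0, 2 * np.pi, n_frames)
    strengths = np.random.uniform(0, MAX_FORCE, n_frames)
    return np.column_stack([np.cos(angles), np.sin(angles)]) * strengths[:, None]

def fitness(position, target):
    """Inversely proportional to distance: the closer the rocket, the higher the score."""
    distance = np.linalg.norm(np.asarray(target) - np.asarray(position))
    return 1.0 / (1.0 + distance)   # the +1 avoids division by zero at the target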

The process for building a pool of individuals and generating a new array of child rockets is exactly the same as what we did with the previous example of the infinite monkey theorem. There is one fairly significant change, however. The rockets need to live for a period of time before they can be evaluated; they need to be given a chance to make their attempt at reaching the target. So we will create a population of rockets, let the rockets live for N frames, evaluate them, and evolve the next generation with the selection, reproduction and mutation procedures.
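
Putting it together, the generation loop might look like this sketch. It reuses random_dna and fitness from the previous cell, and the physics is reduced to bare kinematics:

In [ ]:
import numpy as np

POP_SIZE, N_GENERATIONS = 50, 100
START, TARGET = np.array([320.0, 460.0]), np.array([320.0, 20.0])

population = [random_dna() for _ in range(POP_SIZE)]
for generation in range(N_GENERATIONS):
    scores = []
    for dna in population:
        # Let the rocket live for N frames before evaluating it
        position, velocity = START.copy(), np.zeros(2)
        for thrust in dna:               # each gene is one frame's thrust vector
            velocity += thrust
            position += velocity
        scores.append(fitness(position, TARGET))
    # Selection, reproduction and mutation, as in the Shakespeare example
    probabilities = np.array(scores) / np.sum(scores)
    children = []
    for _ in range(POP_SIZE):
        a, b = np.random.choice(POP_SIZE, size=2, p=probabilities)
        cut = np.random.randint(1, N_FRAMES)
        child = np.vstack([population[a][:cut], population[b][cut:]])
        mutated = np.random.rand(N_FRAMES, 1) < 0.01   # ~1 % of genes mutate
        children.append(np.where(mutated, random_dna(), child))
    population = children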

4.1. Exercise

Play the simulation and analyse the collective behaviour of the rocket swarm.

  • In the code, locate the files (tabs) for the genotype and the phenotype. Which is which?
  • Change the evolution parameters (population, mutation rate) and adjust them to improve the evolution.
  • The evolution ends after a given number of cycles. What other stop conditions would be feasible? Could all the rockets reach the target?

In [25]:
%%HTML
<iframe src="https://www.openprocessing.org/sketch/432794/embed/" width="640" height="360"></iframe>


4.2. Exercise

Adding obstacles that the rockets must avoid will make the system more complex and demonstrate the power of the evolutionary algorithm more effectively. If a rocket hits an obstacle, it will crash and stop updating its location. In addition, a rocket should be rewarded according to how quickly it reaches the target: the faster it reaches the target, the higher the fitness; the slower, the lower.
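
A possible shape for such a fitness function; `rocket` and its attributes are hypothetical names, not those of the embedded sketch:

In [ ]:
N_FRAMES = 200   # lifetime of a rocket, as above

def fitness(rocket):
    """Reward proximity and speed; penalise hitting an obstacle."""
    score = 1.0 / (1.0 + rocket.distance_to_target)
    if rocket.crashed:
        score *= 0.1                     # heavy penalty for crashing
    if rocket.frames_to_target is not None:
        # The fewer frames needed to reach the target, the higher the reward
        score *= 2.0 * N_FRAMES / rocket.frames_to_target
    return score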

  • Modify the obstacle with a different length or shape.
  • Add gravity to the obstacle, i.e. the rocket acceleration is shifted towards the object.
  • Locate the fitness function and try other equations. For instance, score the minimum distance of the rocket path to the coordinates (150,200) to promote the rocket passing nearby.

In [26]:
%%HTML
<iframe src="https://www.openprocessing.org/sketch/432795/embed/" width="640" height="360"></iframe>


5. Evolution Ecosystem

In the real world, we don’t really have “survival of the fittest”; we have “survival of the survivors.” Things that happen to live longer, for whatever reason, have a greater chance of reproducing. We expect the population to grow and shrink according to how often the individuals die or are born.

We’ll create a creature called a "bloop," a circle that moves about the screen according to a Lévy flight movement. The creature will have a radius and a maximum speed. The bigger it is, the slower it moves; the smaller, the faster. Bloops dying is our replacement for a fitness function, the process of “selection.” If a bloop dies, it cannot be selected to be a parent, because it simply no longer exists. In each frame of animation, a bloop loses some health. When a bloop eats food, it increases its health points, and therefore extends its life. If health drops below zero, the bloop dies. We expect that our system would evolve bloops with an optimal ability to find and eat food.

The parameters of the bloops are indirectly shown on the display: the size reflects the maximum speed (the bigger the bloop, the slower it moves), the colour varies according to the type of Lévy flight, and the transparency of the bloop reflects its health. The genome has only one gene, the maximum speed, so in this world the phenotypes are the result of the genotype plus other factors.

As for selection and reproduction, we stated before that the longer a bloop lives, the more chances it has to reproduce. The length of its life is the bloop’s fitness. At any given moment, a bloop has a small chance of reproducing, so the longer a bloop lives, the more likely it is to make at least one child. Since we are making a child from a single parent, we will simply copy its DNA and mutate it to allow variation.
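
A minimal sketch of these rules; the constants (health decay, food value, reproduction chance) are illustrative, and they are exactly the knobs the balance question in the exercise below asks you to tune:

In [ ]:
import random

class Bloop:
    def __init__(self, max_speed):
        self.max_speed = max_speed               # the single gene
        self.health = 100.0                      # shown as transparency on screen

    def step(self, found_food):
        self.health -= 0.2                       # living costs health every frame
        if found_food:
            self.health += 50.0                  # eating extends the bloop's life

    def maybe_reproduce(self, chance=0.0005):
        """Single-parent reproduction: copy the DNA and mutate it slightly."""
        if random.random() < chance:
            return Bloop(self.max_speed + random.gauss(0, 0.1))
        return None

bloops = [Bloop(random.uniform(1, 5)) for _ in range(20)]
for frame in range(10000):
    for b in list(bloops):
        b.step(found_food=random.random() < 0.01)   # stand-in for movement and food
        child = b.maybe_reproduce()
        if child:
            bloops.append(child)
        if b.health <= 0:
            bloops.remove(b)                     # death is the only selection step
print(len(bloops), "bloops survive")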

5.1. Exercise

Play the simulation and observe the evolution of the system.

  • Is the maximum speed related to the survival of the bloop? Why?
  • Is the type of Lévy flight related to the survival of the bloop? Why?
  • One of the greatest challenges in ecosystem simulations is achieving a nice balance. You will likely find that most of your attempts result in either mass overpopulation (followed by mass extinction) or simply mass extinction straight away. What techniques can you employ to achieve balance?

In [28]:
%%HTML
<iframe src="https://www.openprocessing.org/sketch/432796/embed/" width="640" height="360"></iframe>


References