[The second part of this workshop is partly adapted from Shiffman et al. (2012)]
In this section, we will explore how evolution works, introducing concepts such as the power of incremental mutation to generate radically different phenotypic patterns and the importance of crossover, or genetic recombination, for reaching new zones of the genotypic space, and we will present some basic notions about genetic algorithms.
After a brief introduction to genetic algorithms, we present biomorphs as a simple model for understanding genotype-phenotype relations and how incremental mutations create an exponential diversity of individual phenotypes. But random diversity is only part of the picture, so we will explore how crossover can accelerate the discovery of new genetic combinations far from the initial genetic distribution. Finally, we will turn to fitness-guided evolution in a series of examples that show how selection might operate as a driving force for evolution.
The genetic algorithm is a method for solving both constrained and unconstrained optimization problems that is based on natural selection, the process that drives biological evolution. The genetic algorithm repeatedly modifies a population of individual solutions. At each step, it selects individuals at random from the current population to be parents and uses them to produce the children for the next generation. Over successive generations, the population "evolves" toward an optimal solution.
The genetic algorithm uses three main types of rules at each step to create the next generation from the current population: selection rules, which choose the individuals, called parents, that contribute to the population at the next generation; crossover rules, which combine two parents to form children for the next generation; and mutation rules, which apply random changes to individual parents to form children.
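Before exploring those rules one by one, here is a minimal sketch of the whole loop in Python. The helper names (fitness, select, crossover, mutate) are placeholders standing for the three rules above; they are not code from any of the sketches below.

In [ ]:
def evolve(population, fitness, select, crossover, mutate, generations=100):
    """Skeleton of a genetic algorithm: score, select, recombine, mutate."""
    for _ in range(generations):
        scores = [fitness(individual) for individual in population]
        children = []
        for _ in range(len(population)):
            parent_a = select(population, scores)   # selection rule
            parent_b = select(population, scores)
            child = crossover(parent_a, parent_b)   # crossover rule
            children.append(mutate(child))          # mutation rule
        population = children                       # the next generation
    return population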
In order to get a better grasp of the rules of a genetic algorithm, we will explore its different elements separately using different models.
In developing his argument that natural selection can explain the complex adaptations of organisms, Dawkins' first concern is to illustrate the difference between the potential for the development of complexity as a result of pure randomness, as opposed to that of randomness coupled with cumulative selection.
The program displayed a two-dimensional shape (a "biomorph") made up of straight black lines, the length, position, and angle of which were defined by a simple set of rules and instructions (analogous to a genome). Adding new lines (or removing them) based on these rules offered a discrete set of possible new shapes (mutations), which were displayed on screen so that the user could choose between them. The chosen mutation would then be the basis for another generation of biomorph mutants to be chosen from, and so on. Thus, the user, by selection, could steer the evolution of biomorphs. This process often produced images which were reminiscent of real organisms, for instance beetles, bats, or trees. Dawkins speculated that the unnatural selection role played by the user in this program could be replaced by a more natural agent if, for example, colourful biomorphs could be selected by butterflies or other insects, via a touch-sensitive display set up in a garden.
The appearance of each biomorph is determined by 9 genes: 3 that influence its width, 5 that influence its height, and 1 that influences its branching depth. These 9 genes can be used to generate more than 118 billion different biomorphs. Humans, for comparison, have about 20,000 to 25,000 protein-coding genes. The gradual change in the biomorphs' appearance in each generation serves as a simple model of biological evolution. Each biomorph is nearly identical to its parent, but after many generations the appearance of the biomorph can diverge wildly from the original. Biological evolution works in a similar way, though on a much longer time scale. For example, you closely resemble your parents, your parents closely resemble their parents, and so on, but if you go back tens of thousands of generations your distant ancestors would only bear a slight resemblance to you.
Because you select which child you want to survive, biomorphs are an example of artificial selection (similar to how humans have guided dog evolution over the past 15,000 – 30,000 years). In nature, however, evolution is based on natural selection: organisms that are best suited for their environment are the ones most likely to survive and pass on their genes.
In the example below, the genome of the biomorph is expressed in the center picture. The surrounding pictures correspond to small incremental mutations, each altering slightly one of the original genes. If a gene is at its limit value, the corresponding picture is greyed out and that mutation is not available.
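As a rough illustration of that mutation neighbourhood, the following Python sketch enumerates every single-step mutation of a genome; it assumes, as in the exercise below, nine genes each taking values from 0 to 8.

In [ ]:
import random

GENE_MIN, GENE_MAX = 0, 8  # assumed gene range, as in the exercise below

def neighbours(genome):
    """Yield every genome that differs from this one by +1 or -1 in one gene.

    A gene already at a limit value cannot move past it, which is why the
    corresponding pictures in the sketch are greyed out.
    """
    for i, gene in enumerate(genome):
        for step in (-1, +1):
            if GENE_MIN <= gene + step <= GENE_MAX:
                mutant = list(genome)
                mutant[i] = gene + step
                yield mutant

genome = [random.randint(GENE_MIN, GENE_MAX) for _ in range(9)]
print(len(list(neighbours(genome))), "possible single-step mutants")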
In [17]:
%%HTML
<iframe src="https://www.openprocessing.org/sketch/431860/embed/" width="611" height="611"></iframe>
Now that you know how to change the genome through mutations, let's try some genetic exchange.
Get your evolution log table and select a goal biomorph shape.
To start the exercise, you need a random genome (numbers from 0 to 8). To generate it, write the numbers from 0 to 8, each in a random position. Write your initial genetic code in ‘Genome 1’ - ‘Generation 1’. Then open the Processing sketch below and introduce your initial genome.
Now let’s get promiscuous and start breeding. Look at your partners’ initial phenotypes in their Processing sketches. Select someone you like. Try to select partners that share traits with your goal biomorph (sometimes their biomorph may not look like your objective overall, but it may share some traits you are looking for). As in real life, mutual consent is necessary for sex, so you can only breed with someone if you both agree to do so.
Breeding works as follows. Write down your mate’s genome in ‘Genome 2’ and ask him or her for 3 crossover positions (numbers from 1 to 9); write them down too. Take the three genes from ‘Genome 2’ at these positions and write them into ‘Genome 1’ for the next generation.
Repeat until you reach the desired phenotype.
In [12]:
%%HTML
<iframe src="https://www.openprocessing.org/sketch/432104/embed/" width="610" height="610"></iframe>
The “infinite monkey theorem” is stated as follows: a monkey hitting keys randomly on a typewriter will eventually type the complete works of Shakespeare (given an infinite amount of time). The problem with this theorem is that the probability of said monkey actually typing Shakespeare is so low that even if the monkey had started typing at the Big Bang, it’s unbelievably unlikely we’d have even Hamlet by now.
Let’s consider the phrase “to be or not to be that is the question” (we’re simplifying it from the original “To be, or not to be: that is the question”). The phrase is 39 characters long, and the probability of the monkey hitting any given key is one in twenty-seven (26 letters plus the space), so the probability of the monkey typing the full phrase is (1/27)^39, i.e. one over 66·10^54. Even for a computer simulation typing one million random phrases per second, that means typing for about 10^42 years (note that the age of the universe is estimated to be a mere 14·10^9 years).
We will apply a genetic algorithm to this task. First, we will create a population of randomly generated sentences of 39 characters. Then, we will compare each random sentence with the target sentence to compute its fitness, in this case the percentage of matched characters. The result is a score for each random sentence.
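A minimal version of that fitness function in Python, assuming fitness is simply the fraction of characters matching the target at the same position:

In [ ]:
TARGET = "to be or not to be that is the question"

def fitness(phrase):
    """Fraction of characters that match the target at the same position."""
    matches = sum(a == b for a, b in zip(phrase, TARGET))
    return matches / len(TARGET)

print(fitness("to be or not to go that is the question"))  # 37/39, about 0.95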
Now, there are different ways to apply selection. For example, we could employ what is known as the elitist method and say, “Which two members of the population scored the highest? You two will make all the children for the next generation.” This is probably one of the easier methods to program; however, it flies in the face of the principle of variation: little variety may stunt the evolutionary process, as there could be entire families of variations not evaluated. A better solution is to use a probabilistic method, which we’ll call the “wheel of fortune” (also known as the “roulette wheel”). We will assign a probability of selection for each sentence proportional to its score.
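A sketch of the wheel of fortune in Python; the explicit loop makes the mechanics visible, although random.choices(population, weights=scores) would do the same job in one line.

In [ ]:
import random

def wheel_of_fortune(population, scores):
    """Pick one individual with probability proportional to its score."""
    pick = random.uniform(0, sum(scores))
    running = 0.0
    for individual, score in zip(population, scores):
        running += score
        if running >= pick:
            return individual
    return population[-1]  # guard against floating-point rounding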
Now that we have a strategy for picking parents, we need to figure out how to use reproduction to make the population’s next generation, keeping in mind the Darwinian principle of heredity—that children inherit properties from their parents. The standard approach with genetic algorithms is to pick two parents and create a child by applying crossover, i.e. building the child out of the genetic code of both parents.
Let’s assume we’ve picked two phrases: FORK and PLAY. We will pick a cutting point and take the first part from one parent and the second part from the other. For instance, a cut between the first and second characters will produce the children FLAY and PORK; a cut between the second and third characters, FOAY and PLRK; and so on.
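In code, a single cutting point produces both children at once; here the cut is the number of characters kept from the first parent.

In [ ]:
def crossover(parent_a, parent_b, cut):
    """Single-point crossover: swap the tails of the two parents at the cut."""
    child1 = parent_a[:cut] + parent_b[cut:]
    child2 = parent_b[:cut] + parent_a[cut:]
    return child1, child2

print(crossover("FORK", "PLAY", 1))  # ('FLAY', 'PORK')
print(crossover("FORK", "PLAY", 2))  # ('FOAY', 'PLRK')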
Finally, we will randomly mutate the children to introduce additional variety throughout the evolutionary process itself.
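A minimal mutation step, assuming the 27-character alphabet used above and an illustrative mutation rate:

In [ ]:
import random
import string

CHARSET = string.ascii_lowercase + " "  # the 27 possible characters

def mutate(phrase, mutation_rate=0.01):
    """Replace each character by a random one with probability mutation_rate."""
    return "".join(random.choice(CHARSET) if random.random() < mutation_rate else c
                   for c in phrase)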
The result of the previous explanation is this simulation. To analyse the influence of the parameters of the genetic algorithm, open the code and change the population (popmax) and mutation rate (mutationRate) variables at lines 48 and 49.
In [13]:
%%HTML
<iframe src="https://www.openprocessing.org/sketch/432793/embed/" width="640" height="360"></iframe>
In 2009, Jer Thorp released a genetic algorithm example on his blog entitled “Smart Rockets”. Jer points out that NASA uses evolutionary computing techniques to solve all sorts of problems, from satellite antenna design to rocket firing patterns. This inspired him to create a Flash demonstration of evolving rockets. Here is a description of the scenario.
A population of rockets launches from the bottom of the screen with the goal of hitting a target at the top of the screen (with no obstacles blocking a straight-line path). Each rocket is equipped with a thruster able to fire in any direction with any strength at every frame of animation. Our strategy will be to pick some reasonable numbers and build out the system.
We stated the goal of a rocket reaching a target. In other words, the closer a rocket gets to the target, the higher its fitness. Fitness is inversely proportional to distance: the smaller the distance, the greater the fitness; the greater the distance, the smaller the fitness. We will initialize a DNA sequence as an array of random vectors quantifying the strength and direction of the rocket’s thrust at each moment of the simulation. This will be the genotype; the phenotype will be the displayed trajectory of the rocket on the board. The fitness function will therefore be calculated on the phenotype, as it holds the information needed to compute the distance to the target.
The process for building a pool of individuals and generating a new array of child rockets is exactly the same as what we did with the previous example of the infinite monkey theorem. There is one fairly significant change, however. The rockets need to live for a period of time before they can be evaluated; they need to be given a chance to make their attempt at reaching the target. So we will create a population of rockets, let the rockets live for N frames, evaluate them, and evolve the next generation with the selection, reproduction and mutation procedures.
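Putting it all together, here is a self-contained Python sketch of that cycle. The physics, lifetime, population size and target position are illustrative assumptions, not values taken from the Processing sketch.

In [ ]:
import math
import random

LIFETIME, POP, TARGET = 200, 50, (320.0, 40.0)  # assumed values

def random_gene():
    """One thrust vector (ax, ay): sideways drift plus upward push."""
    return (random.uniform(-0.1, 0.1), random.uniform(-0.2, 0.0))

def new_dna():
    """Genotype: one random thrust vector per frame of life."""
    return [random_gene() for _ in range(LIFETIME)]

def simulate(dna, start=(320.0, 360.0)):
    """Phenotype: fly the rocket for LIFETIME frames and return its end point."""
    x, y = start
    vx = vy = 0.0
    for ax, ay in dna:                 # each gene fires the thruster once
        vx, vy = vx + ax, vy + ay
        x, y = x + vx, y + vy
    return x, y

def fitness(dna):
    """Inversely proportional to the final distance from the target."""
    return 1.0 / (math.dist(simulate(dna), TARGET) + 1e-6)

population = [new_dna() for _ in range(POP)]
for generation in range(20):
    scores = [fitness(dna) for dna in population]

    def breed():
        # roulette-wheel selection, single-point crossover, light mutation
        a, b = random.choices(population, weights=scores, k=2)
        cut = random.randrange(1, LIFETIME)
        child = a[:cut] + b[cut:]
        return [g if random.random() > 0.01 else random_gene() for g in child]

    population = [breed() for _ in range(POP)]

best = min(math.dist(simulate(dna), TARGET) for dna in population)
print(f"closest final distance after 20 generations: {best:.1f}")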
Play the simulation and analyse the collective behaviour of the rocket swarm.
In [14]:
%%HTML
<iframe src="https://www.openprocessing.org/sketch/432794/embed/" width="640" height="360"></iframe>
Adding obstacles that the rockets must avoid will make the system more complex and demonstrate the power of the evolutionary algorithm more effectively. If a rocket hits an obstacle, it will crash and stop updating its location. In addition, a rocket should be rewarded according to how quickly it reaches the target: the faster it reaches the target, the higher the fitness; the slower, the lower.
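One way to encode these two rules is sketched below, using an illustrative record of each rocket’s run; the field names are assumptions, not taken from the Processing sketch.

In [ ]:
from dataclasses import dataclass

@dataclass
class RocketRun:
    """Illustrative summary of one rocket's life, recorded by the simulation."""
    distance: float      # final distance to the target
    finish_time: float   # frames needed to hit the target (inf if it never did)
    crashed: bool        # True if the rocket hit an obstacle

def fitness(run: RocketRun) -> float:
    score = 1.0 / (run.distance + 1.0)        # closer is better (1.0 on a hit)
    if run.finish_time != float("inf"):
        score *= 100.0 / run.finish_time      # reaching the target fast pays off
    if run.crashed:
        score *= 0.1                          # heavy penalty for crashing
    return score

print(fitness(RocketRun(distance=0.0, finish_time=80.0, crashed=False)))
print(fitness(RocketRun(distance=50.0, finish_time=float("inf"), crashed=True)))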
In [15]:
%%HTML
<iframe src="https://www.openprocessing.org/sketch/432795/embed/" width="640" height="360"></iframe>
In the real world, we don’t really have “survival of the fittest”; we have “survival of the survivors.” Things that happen to live longer, for whatever reason, have a greater chance of reproducing. We expect the population to grow and shrink according to how often the individuals die or are born.
We’ll create a creature called a "bloop," a circle that moves about the screen following a Lévy flight. The creature will have a radius and a maximum speed. The bigger it is, the slower it moves; the smaller, the faster. Bloops dying is our replacement for a fitness function, the process of “selection.” If a bloop dies, it cannot be selected to be a parent, because it simply no longer exists. In each frame of animation, a bloop loses some health. When a bloop eats food, it increases its health points, and therefore extends its life. If its health drops below zero, the bloop dies. We expect our system to evolve bloops with an optimal ability to find and eat food.
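A bare-bones sketch of this life cycle in Python; the health costs and the food value are illustrative constants, not taken from the Processing sketch.

In [ ]:
class Bloop:
    """One gene (max speed); health drains every frame and food restores it."""

    def __init__(self, max_speed):
        self.max_speed = max_speed
        self.radius = 10.0 / max_speed   # bigger bloops are slower, and vice versa
        self.health = 100.0

    def update(self):
        self.health -= 0.2               # living costs a little health each frame

    def eat(self):
        self.health += 50.0              # eating food extends the bloop's life

    def dead(self):
        return self.health <= 0.0        # death is the only "selection" there is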
The parameters of the bloops are indirectly shown on the display. The size is inversely related to the maximum speed (the bigger the bloop, the slower it moves), the colour varies according to the type of Lévy flight, and the transparency of the bloop reflects its health. The genome has only one gene, the maximum speed, so in this world the phenotypes are the result of the genotype plus other factors.
Regarding selection and reproduction, we stated before that the longer a bloop lives, the more chances it has to reproduce: the length of its life is the bloop’s fitness. At any given moment, a bloop has a small chance of reproducing, so the longer a bloop lives, the more likely it is to make at least one child. Since we are making a child from a single parent, we will copy its DNA and mutate it to allow variation.
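Reproduction then becomes a per-frame coin flip. Here is a sketch reusing the Bloop class above, with illustrative probabilities:

In [ ]:
import random

def maybe_reproduce(bloop, chance=0.0005, mutation_rate=0.1):
    """Each frame, a living bloop may clone itself with a small probability.

    The child copies the single-gene DNA (the max speed) and occasionally
    mutates it slightly, which is enough to allow variation.
    """
    if random.random() < chance:
        speed = bloop.max_speed
        if random.random() < mutation_rate:
            speed *= random.uniform(0.9, 1.1)   # small random tweak
        return Bloop(speed)
    return None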
Play the simulation and observe the evolution of the system.
In [16]:
%%HTML
<iframe src="https://www.openprocessing.org/sketch/432796/embed/" width="640" height="360"></iframe>