Basic Programming Using Python: Growing a Program

Objectives

FIXME

Lesson

We've introduced a lot of concepts in the past few lessons, so it's time to bring them together and show how we use them to create programs that actually do some useful science. We'll start by creating a simple simulation program, then explore data analysis in the next lesson.

The physical phenomenon we're going to simulate is called diffusion limited aggregation (DLA), and is responsible for beautiful fractal shapes like the one shown below:

The process begins with a single fixed seed surrounded by freely-movied particles. These may be atoms in solution or flakes of soot in a chimney; what matters is that when one hits the seed, it sticks. Other particles then stick to the growing cluster to form a distinctive branching shape.

Our simulation will be very simple. We will start by creating a two-dimensional grid of cells, all but one of which are empty. We will then "drop" one particle at a time into the grid. At each time step, the particle will fall one level in the grid and possibly move either left or right as it does so. If a particle touches the growing aggregate, it sticks, and the simulation stops when the aggregate reaches the top of the grid:

FIXME: diagram

In order to build this, we need a way to represent a two-dimensional grid, and a way to generate random numbers. We can use ImageGrid for the former (though we'll come back later and use something else), so let's explore the latter before we dive into writing code.

Random Numbers

Let's take a look at a small program that does nothing but generate a few random numbers:


In [170]:
import random
for i in range(5):
    print random.randint(1, 10)


2
10
8
3
5

The first step is to import the standard Python random number library called (unsurprisingly) random. We can then call random.randint to produce as many "random" numbers as we want. The quotes around the word "random" are important: the values produced by this library (and its equivalents in other languages) are only pseudo-random, and it's important for scientists to understand the distinction in order to use them properly.

To see why, let's try generating some random values ourselves:


In [171]:
base = 17
value = 4
for i in range(5):
    print value
    value = (3 * value + 5) % base


4
0
5
3
14

The base determines how many integers we get before the sequence starts to repeat itself. The sequence eventually does have to repeat: computers can only represent a finite range of numbers, so sooner or later, it's going to go back over its tracks. Once it does, values will appear in the same order as they did the first time:


In [172]:
base = 17
value = 4
for i in range(40):
    print value,
    value = (3 * value + 5) % base


4 0 5 3 14 13 10 1 8 12 7 9 15 16 2 11 4 0 5 3 14 13 10 1 8 12 7 9 15 16 2 11 4 0 5 3 14 13 10 1

The seed controls where the sequence starts. If we choose a different seed, such as 9, the sequence starts in a different place, but produces the same values in the same order:


In [173]:
value = 9
for i in range(10):
    print value,
    value = (3 * value + 5) % base


9 15 16 2 11 4 0 5 3 14

But there's a problem, and that problem is the reason scientists should never, ever write their own random number generators. You may not have noticed, but the number 6 doesn't appear in the sequences above. That's bad news: having gaps in our supposedly-random distribution would almost certainly introduce bias in our simulations and statistics. What's worse is what happens if we choose 6 as our seed:


In [174]:
value = 6
for i in range(10):
    print value,
    value = (3 * value + 5) % base


6 6 6 6 6 6 6 6 6 6

Those "random" values would definitely introduce bias into our calculations. Can we be sure that random.randint won't do this, and that something subtler won't go wrong?

The answer to both questions is "yes, but it's hard". Modern pseudo-random number generators rely on some sophisticated mathematics to ensure that the values they produce have all the properties of real randomness. Scientists should always use them instead of building their own.

A Function at a Time

Now that we know how to generate random numbers, we can start to grow our DLA program. The first step is to write a main driver to control the whole simulation:


In [175]:
from ipythonblocks import ImageGrid, colors

def main(size):
    '''Simulate diffusion-limited aggregation.'''
    grid = create_grid(size)
    filling = True
    while filling:
        fill_next(grid)
        if finished(grid):
            filling = False
    grid.show()

This is about as generic as it can be: we create a grid, then fill in one cell at a time until the simulation is supposed to end, at which point we display the aggregate. Let's try running it:


In [176]:
main()


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-176-58ca95c5b364> in <module>()
----> 1 main()

TypeError: main() takes exactly 1 argument (0 given)

Excellent: it fails, just as we expect it too. If it had succeeded in running, we'd know that we had somehow accidentally imported or defined a create_grid function.

We now have three functions to write: create_grid, fill_next, and finished. Hearkening back to our previous lesson, though, what we really have is three sets of tests to write to define more clearly what we want these to do. Since fill_next and finished need a grid to work on, let's write create_grid's tests first:


In [177]:
def test_create_3x3():
    grid = create_grid(3)
    assert grid.width == 3
    assert grid.height == 3

Let's try running that:


In [178]:
import ears
ears.run()


........
...
11 pass, 0 fail, 0 error

Again, this is good: we haven't written create_grid, so our test shouldn't pass yet. Let's write the function:


In [179]:
def create_grid(size):
    '''Create a size X size grid for use in DLA.'''
    grid = ImageGrid(size, size, fill=colors['White'])
    return grid

and re-run our tests:


In [180]:
ears.run()


.f...f..
f..
8 pass, 3 fail, 0 error
----------------------------------------
fail: test_not_contact_with_seed
Traceback (most recent call last):
  File "ears.py", line 43, in run
    test()
  File "<ipython-input-101-0ddfb9a18aec>", line 5, in test_not_contact_with_seed
    assert not in_contact(create_grid(3), 1, 2)
AssertionError

----------------------------------------
fail: test_finished_at_start
Traceback (most recent call last):
  File "ears.py", line 43, in run
    test()
  File "<ipython-input-94-ace7e5274521>", line 3, in test_finished_at_start
    assert not finished(grid)
AssertionError

----------------------------------------
fail: test_3x3_filled
Traceback (most recent call last):
  File "ears.py", line 43, in run
    test()
  File "<ipython-input-159-aaa36534628c>", line 4, in test_3x3_filled
    assert not filled(grid, 0, 0)
AssertionError

We're not done yet, though. In order for things to start aggregating, we need a seed cell in our grid; in order to create one, we need to decide where it goes, which is a question of science, not programming. Since we're introducing particles at the top of the chimney, and they can only ever fall, let's put our seed in the center of the grid's bottom row. That leads to the following test:


In [181]:
def test_3x3_filled():
    grid = create_grid(3)
    assert grid[0, 0].rgb == colors['White']
    assert grid[0, 1].rgb == colors['White']
    assert grid[0, 2].rgb == colors['White']
    assert grid[1, 0].rgb == colors['Black'] # only fill the center cell of the bottom row
    assert grid[1, 1].rgb == colors['White']
    assert grid[1, 2].rgb == colors['White']
    assert grid[2, 0].rgb == colors['White']
    assert grid[2, 1].rgb == colors['White']
    assert grid[2, 2].rgb == colors['White']

ears.run()


.f...f..f..
8 pass, 3 fail, 0 error
----------------------------------------
fail: test_not_contact_with_seed
Traceback (most recent call last):
  File "ears.py", line 43, in run
    test()
  File "<ipython-input-101-0ddfb9a18aec>", line 5, in test_not_contact_with_seed
    assert not in_contact(create_grid(3), 1, 2)
AssertionError

----------------------------------------
fail: test_finished_at_start
Traceback (most recent call last):
  File "ears.py", line 43, in run
    test()
  File "<ipython-input-94-ace7e5274521>", line 3, in test_finished_at_start
    assert not finished(grid)
AssertionError

----------------------------------------
fail: test_3x3_filled
Traceback (most recent call last):
  File "ears.py", line 43, in run
    test()
  File "<ipython-input-181-a6855904c6d5>", line 6, in test_3x3_filled
    assert grid[1, 0].rgb == colors['Black'] # only fill the center cell of the bottom row
AssertionError

and then rewrite create_grid to pass our test:


In [182]:
def create_grid(size):
    '''Create a size X size grid and fill in the middle bottom cell.'''
    grid = ImageGrid(size, size, fill=colors['White'])
    middle = size/2
    grid[middle, 0] = colors['Black']
    return grid

ears.run()


.f...f.....
9 pass, 2 fail, 0 error
----------------------------------------
fail: test_not_contact_with_seed
Traceback (most recent call last):
  File "ears.py", line 43, in run
    test()
  File "<ipython-input-101-0ddfb9a18aec>", line 5, in test_not_contact_with_seed
    assert not in_contact(create_grid(3), 1, 2)
AssertionError

----------------------------------------
fail: test_finished_at_start
Traceback (most recent call last):
  File "ears.py", line 43, in run
    test()
  File "<ipython-input-94-ace7e5274521>", line 3, in test_finished_at_start
    assert not finished(grid)
AssertionError

This seems to be working, so we'll move on. We need to write either fill_next or finished next; the former is called before the latter, but the latter is probably simpler to build and test, so we'll do it first. Again, we start by writing some tests:


In [183]:
def test_finished_at_start():
    grid = create_grid(3)
    assert not finished(grid)

def test_finished_at_top():
    grid = create_grid(3)
    grid[0, 2] = colors['Black']
    assert finished(grid)

ears.run()


.f...f.....
9 pass, 2 fail, 0 error
----------------------------------------
fail: test_not_contact_with_seed
Traceback (most recent call last):
  File "ears.py", line 43, in run
    test()
  File "<ipython-input-101-0ddfb9a18aec>", line 5, in test_not_contact_with_seed
    assert not in_contact(create_grid(3), 1, 2)
AssertionError

----------------------------------------
fail: test_finished_at_start
Traceback (most recent call last):
  File "ears.py", line 43, in run
    test()
  File "<ipython-input-183-ace7e5274521>", line 3, in test_finished_at_start
    assert not finished(grid)
AssertionError

Once again, these failures are good: they mean that we aren't accidentally using a finished function that we didn't know we had. Let's try writing ours:


In [184]:
def finished(grid):
    '''Are we finished simulating DLA on this grid?'''
    for x in range(grid.width):
        if grid[x, grid.height-1].rgb == colors['Black']:
            return True
    return False

This function uses a very common design pattern: as soon as the loop finds an answer, we return it, and if we finish the loop without having returned an answer, we return a default. Let's re-run our tests:


In [185]:
ears.run()


.f.........
10 pass, 1 fail, 0 error
----------------------------------------
fail: test_not_contact_with_seed
Traceback (most recent call last):
  File "ears.py", line 43, in run
    test()
  File "<ipython-input-101-0ddfb9a18aec>", line 5, in test_not_contact_with_seed
    assert not in_contact(create_grid(3), 1, 2)
AssertionError

Once again, this seems to be working, so we'll move on to our last function fill_next. This needs to add a particle and then move it until it either sticks or falls off the map. Before we can write it, though, we need to decide whether "falling off the map" includes falling off the sides. Does a particle that tries to move beyond the left or right edge of the map disappear, stick, bounce, or just keep moving? To keep things simple, we'll decide that if it moves off the grid in any direction, it's gone forever. That decision gives us this function:


In [186]:
def fill_next(grid):
    x = random.randint(0, grid.width-1) # random starting point in X
    y = grid.height - 1                 # start at the top of the map
    while True:

        # Stuck?
        if in_contact(grid, x, y):
            grid[x, y] = colors['Black']
            return

        # Move.
        y -= 1 # always fall
        x += random.randint(-1, 1) # move left, down, or right with equal probability
        
        # Off the grid?
        if not on_grid(grid, x, y):
            return

Once again, we have written this code as if we had the functions we need, and must now go back and write them (and their tests).


In [187]:
def test_bottom_corner_is_within_grid():
    assert on_grid(create_grid(3), 0, 0)

def test_center_is_within_grid():
    assert on_grid(create_grid(3), 1, 1)

def test_off_left_not_on_grid():
    assert not on_grid(create_grid(3), -1, 1)

def test_off_right_not_on_grid():
    assert not on_grid(create_grid(3), 3, 2)

def test_off_bottom_not_on_grid():
    assert not on_grid(create_grid(3), 1, -1)

def on_grid(grid, x, y):
    '''Check whether a cell is within the grid's boundaries.'''
    return (0 <= x < grid.width) and (0 <= y < grid.height)

ears.run()


.f.........
10 pass, 1 fail, 0 error
----------------------------------------
fail: test_not_contact_with_seed
Traceback (most recent call last):
  File "ears.py", line 43, in run
    test()
  File "<ipython-input-101-0ddfb9a18aec>", line 5, in test_not_contact_with_seed
    assert not in_contact(create_grid(3), 1, 2)
AssertionError


In [188]:
def test_contact_with_seed():
    assert in_contact(create_grid(3), 0, 0)

def test_not_contact_with_seed():
    assert not in_contact(create_grid(3), 1, 2)

def in_contact(grid, x, y):
    '''Is the cell at (x, y) in contact with a filled region?'''
    if on_grid(grid, x-1, y) and grid[x-1, y] == colors['Black']:
        return True
    if on_grid(grid, x+1, y) and grid[x+1, y] == colors['Black']:
        return True
    if on_grid(grid, x, y-1) and grid[x, y-1] == colors['Black']:
        return True
    if on_grid(grid, x, y+1) and grid[x, y+1] == colors['Black']:
        return True
    return False

ears.run()


....f......
10 pass, 1 fail, 0 error
----------------------------------------
fail: test_contact_with_seed
Traceback (most recent call last):
  File "ears.py", line 43, in run
    test()
  File "<ipython-input-188-0ddfb9a18aec>", line 2, in test_contact_with_seed
    assert in_contact(create_grid(3), 0, 0)
AssertionError

Whoops: if we create a 3×3 grid and fill in the center bottom cell, then check whether the lower-right corner is adjacent to the filled region, the result ought to be True. Our mistake is to compare the cell's value directly to a color; what we should instead do is compare grid[x,y].rgb to the color. Let's rewrite our function and re-run our tests:


In [189]:
def in_contact(grid, x, y):
    '''Is the cell at (x, y) in contact with a filled region?'''
    if on_grid(grid, x-1, y) and grid[x-1, y].rgb == colors['Black']:
        return True
    if on_grid(grid, x+1, y) and grid[x+1, y].rgb == colors['Black']:
        return True
    if on_grid(grid, x, y-1) and grid[x, y-1].rgb == colors['Black']:
        return True
    if on_grid(grid, x, y+1) and grid[x, y+1].rgb == colors['Black']:
        return True
    return False

ears.run()


...........
11 pass, 0 fail, 0 error

That works, but it's still not pretty: we have four pairs of lines that are almost duplicated, and if we need to make changes, the odds are that we'll change some but forget to change others. We'll come back and fix this in a moment; for now, let's see if our entire program now runs.


In [190]:
main(3)


Good and Bad Designs

Is that correct? It's hard to tell without being able to see how big the grid is, so let's change the color of our background cells to something that will show up against a white browser page:


In [191]:
def create_grid(size):
    '''Create a size X size grid and fill in the middle bottom cell.'''
    grid = ImageGrid(size, size, fill=colors['Gray'])
    middle = size/2
    grid[middle, 0] = colors['Black']
    return grid

ears.run()


........f..
10 pass, 1 fail, 0 error
----------------------------------------
fail: test_3x3_filled
Traceback (most recent call last):
  File "ears.py", line 43, in run
    test()
  File "<ipython-input-181-a6855904c6d5>", line 3, in test_3x3_filled
    assert grid[0, 0].rgb == colors['White']
AssertionError

Whoops again: our test for an empty cell is checking that the cell is white, so when we change the fill-in color, the test fails. That's easy enough to fix—we could just update the test—but look at what happens if we decide to blue to color filled-in cells instead of black:


In [192]:
def create_grid(size):
    '''Create a size X size grid and fill in the middle bottom cell.'''
    grid = ImageGrid(size, size, fill=colors['Gray'])
    middle = size/2
    grid[middle, 0] = colors['Blue']
    return grid

ears.run()


....f...f..
9 pass, 2 fail, 0 error
----------------------------------------
fail: test_contact_with_seed
Traceback (most recent call last):
  File "ears.py", line 43, in run
    test()
  File "<ipython-input-188-0ddfb9a18aec>", line 2, in test_contact_with_seed
    assert in_contact(create_grid(3), 0, 0)
AssertionError

----------------------------------------
fail: test_3x3_filled
Traceback (most recent call last):
  File "ears.py", line 43, in run
    test()
  File "<ipython-input-181-a6855904c6d5>", line 3, in test_3x3_filled
    assert grid[0, 0].rgb == colors['White']
AssertionError

Our real mistake has been to use black, white, blue, and gray to mean two different things: whether a cell is logically filled or not, and what color should be displayed to show the cell. Let's change the code to used FILLED and EMPTY instead of absolute color values:


In [193]:
EMPTY = colors['Gray']
FILLED = colors['Blue']

def main(size):
    '''Simulate diffusion-limited aggregation.'''
    grid = create_grid(size)
    filling = True
    while filling:
        fill_next(grid)
        if finished(grid):
            filling = False
    grid.show()

def create_grid(size):
    '''Create a size X size grid and fill in the middle bottom cell.'''
    grid = ImageGrid(size, size, fill=EMPTY)
    middle = size/2
    grid[middle, 0] = FILLED
    return grid

def finished(grid):
    '''Are we finished simulating DLA on this grid?'''
    for x in range(grid.width):
        if grid[x, grid.height-1].rgb == FILLED:
            return True
    return False

def fill_next(grid):
    x = random.randint(0, grid.width-1)
    y = grid.height - 1
    while True:
        if in_contact(grid, x, y):
            grid[x, y] = FILLED
            return

        y -= 1
        x += random.randint(-1, 1)
        
        if not on_grid(grid, x, y):
            return

def in_contact(grid, x, y):
    '''Is the cell at (x, y) in contact with a filled region?'''
    if on_grid(grid, x-1, y) and grid[x-1, y].rgb == FILLED:
        return True
    if on_grid(grid, x+1, y) and grid[x+1, y].rgb == FILLED:
        return True
    if on_grid(grid, x, y-1) and grid[x, y-1].rgb == FILLED:
        return True
    if on_grid(grid, x, y+1) and grid[x, y+1].rgb == FILLED:
        return True
    return False

def test_finished_at_top():
    grid = create_grid(3)
    grid[0, 2] = FILLED
    assert finished(grid)

def test_3x3_filled():
    grid = create_grid(3)
    assert grid[0, 0].rgb == EMPTY
    assert grid[0, 1].rgb == EMPTY
    assert grid[0, 2].rgb == EMPTY
    assert grid[1, 0].rgb == FILLED
    assert grid[1, 1].rgb == EMPTY
    assert grid[1, 2].rgb == EMPTY
    assert grid[2, 0].rgb == EMPTY
    assert grid[2, 1].rgb == EMPTY
    assert grid[2, 2].rgb == EMPTY

ears.run()


...........
11 pass, 0 fail, 0 error

and then run our program again:


In [194]:
main(3)


Is that right? Let's try another run:


In [195]:
main(3)


and another:


In [196]:
main(3)


Those are all legal fractals, albeit very small ones—does that mean our program is working? How many tests would we have to do to convince ourselves?

Before we explore those questions, though, we should think about the changes we just made. When we wanted to change the colors used to display cells, we had to rewrite every single one of our functions and two of our tests. That tells us that we had a bad design, because small logical changes to programs should only require small physical changes to code. We think our new design is better because we can change the colors again simply by redefining a couple of variables:


In [197]:
EMPTY = colors['Silver']
FILLED = colors['Red']

ears.run()

main(19)


...........
11 pass, 0 fail, 0 error

That's much better: a simple change was simple to make and didn't break any of our existing tests.

Changing Colors

Let's try one more test of our design. Instead of coloring all the cells the same color, let's change the color gradually as they're filled in so that we can see how our fractal grows. We'll color the seed bright red (i.e., (255,0,0)), then color all the other cells from black back up to bright red, incrementing the red value of the color triple by one each time.

This change means that we will no longer have a single unique FILLED value, which in turn means that we should use cell[x,y].rgb != EMPTY instead of cell[x,y].rgb == FILLED to check cells' states.


In [198]:
EMPTY = colors['Gray']

def create_grid(size):
    '''Create a size X size grid and fill in the middle bottom cell.'''
    grid = ImageGrid(size, size, fill=EMPTY)
    middle = size/2
    grid[middle, 0] = make_color(True) # CHANGED
    return grid

def finished(grid):
    '''Are we finished simulating DLA on this grid?'''
    for x in range(grid.width):
        if grid[x, grid.height-1].rgb != EMPTY: # CHANGED
            return True
    return False

def fill_next(grid):
    x = random.randint(0, grid.width-1)
    y = grid.height - 1
    while True:
        if in_contact(grid, x, y):
            grid[x, y] = make_color(False) # CHANGED
            return

        y -= 1
        x += random.randint(-1, 1)
        
        if not on_grid(grid, x, y):
            return

def in_contact(grid, x, y):
    '''Is the cell at (x, y) in contact with a filled region?'''
    if on_grid(grid, x-1, y) and grid[x-1, y].rgb != EMPTY: # CHANGED
        return True
    if on_grid(grid, x+1, y) and grid[x+1, y].rgb != EMPTY: # CHANGED
        return True
    if on_grid(grid, x, y-1) and grid[x, y-1].rgb != EMPTY: # CHANGED
        return True
    if on_grid(grid, x, y+1) and grid[x, y+1].rgb != EMPTY: # CHANGED
        return True
    return False

Hm... the two places where we were changing cell colors now use a function called make_color, which we'll write in a second. The other five changes are all checking whether a cell is filled or empty; they're mostly duplicates of one another, so let's collapse them into yet another function:


In [199]:
def filled(grid, x, y):
    '''Is grid[x,y] filled or not?'''
    return on_grid(grid, x, y) and (grid[x, y].rgb != EMPTY)

def finished(grid):
    '''Are we finished simulating DLA on this grid?'''
    for x in range(grid.width):
        if filled(grid, x, grid.height-1):
            return True
    return False

def in_contact(grid, x, y):
    '''Is the cell at (x, y) in contact with a filled region?'''
    return filled(grid, x-1, y) or filled(grid, x+1, y) or \
           filled(grid, x, y-1) or filled(grid, x, y+1)

Now we can define make_color:


In [200]:
Current = 255

def make_color(reset):
    global Current
    if reset:
        Current = 255
    result = (Current, 0, 0)
    Current = (Current + 1) % 256
    return result

and run our tests:


In [201]:
ears.run()


...........
11 pass, 0 fail, 0 error

Let's give it a try:


In [202]:
main(5)


That looks OK, but it's not really enough to tell us whether our coloring is working. Here's a larger grid:


In [203]:
main(21)


Right away we can see that the fractal generally grows up, but occasionally a new branch starts off on the side.

What if we want to reproduce a fractal? Recalling the start of this lesson, we can do this by reproducing the sequence of pseudo-random numbers, which simply means using the same seed each time. Let's modify main to take a random number seed as well as grid size, then generate half a dozen small fractals to test it:


In [221]:
def main(size, seed):
    '''Simulate diffusion-limited aggregation using a particular sequence of random values.'''
    random.seed(seed)
    grid = create_grid(size)
    filling = True
    while filling:
        fill_next(grid)
        if finished(grid):
            filling = False
    grid.show()

for i in range(10):
    print 'seed:', 1000 + i
    main(9, 1000 + i)


seed: 1000
seed: 1001
seed: 1002
seed: 1003
seed: 1004
seed: 1005
seed: 1006
seed: 1007
seed: 1008
seed: 1009

Wait a second: what happened to the fractal generated by the seed 1007? Why is its seed square black rather than red? And why is that only happening once in 10 runs?

Defensive Programming Revisited

The fact is, we currently don't have any way to check whether our fractals are "right" or not. They look plausible, but since we don't have anything physical to compare them against, that actually doesn't mean very much. What we've really been doing is saying that we think our code is right, so we're going to trust its output. Clearly, that's broken down.

So let's return to an earlier idea and see if we can add a few assertions to our code to check that it's working correctly. In particular, our moving particle should never find itself in a cell that's already filled:


In [227]:
def fill_next(grid):
    x = random.randint(0, grid.width-1)
    y = grid.height - 1
    while True:
        assert not filled(grid, x, y), 'Cell ({0},{1}) is already filled'.format(x, y)
        if in_contact(grid, x, y):
            grid[x, y] = make_color(False)
            return

        y -= 1
        x += random.randint(-1, 1)
        
        if not on_grid(grid, x, y):
            return

# Try the fractal that wasn't working.
main(9, 1007)


---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-227-bf9a910b861d> in <module>()
     15 
     16 # Try the fractal that wasn't working.
---> 17 main(9, 1007)

<ipython-input-221-b586526affb1> in main(size, seed)
      5     filling = True
      6     while filling:
----> 7         fill_next(grid)
      8         if finished(grid):
      9             filling = False

<ipython-input-227-bf9a910b861d> in fill_next(grid)
      3     y = grid.height - 1
      4     while True:
----> 5         assert not filled(grid, x, y), 'Cell ({0},{1}) is already filled'.format(x, y)
      6         if in_contact(grid, x, y):
      7             grid[x, y] = make_color(False)

AssertionError: Cell (4,0) is already filled

A bit of head-scratching reveals the problem. We are testing for filled neighbors in the four orthogonal directions (up, down, left, and right), but allowing particles to move diagonally down and left or down and right. That means we will sometimes believe that a cell is in free space when it is diagonally adjacent to a filled cell, and then move it onto that filled cell:

FIXME: diagram

We could say this is just a bug in our code, but it's also a bug in our scientific model. If we fix it by saying that particles can stick to the aggregate along diagonals, we will actually be simulating a different system from the one we started with. Alternatively, we can say that particles move either left, right, or down, but never diagonally. Once again, this is a different system than our original one; before we make that change to the code, we need to decide if that's the science we want to do.

After tossing a coin, we decide to change the way particles move rather than the way they stick. We also decide to put that decision in a function, since that will make our decision easier to revisit in future:


In [229]:
def fill_next(grid):
    x = random.randint(0, grid.width-1)
    y = grid.height - 1
    while True:
        assert not filled(grid, x, y), 'Cell ({0},{1}) is already filled'.format(x, y)
        if in_contact(grid, x, y):
            grid[x, y] = make_color(False)
            return
        x, y = move(x, y)
        if not on_grid(grid, x, y):
            return

def move(x, y):
    '''Move the particle currently at (x,y).'''
    direction = random.randint(0, 2)
    if direction == 0:
        return x-1, y
    elif direction == 1:
        return x, y-1
    else:
        return x+1, y

ears.run()


...........
11 pass, 0 fail, 0 error

Let's see if this has fixed our problem:


In [230]:
main(9, 1007)


And something larger:


In [231]:
main(31, 12345)


All right, that looks promising, but what can we do to guard against other errors? After all, we only spotted this one by chance.

One idea is to check the physical properties of the system we're simulating. In particular, every particle we inject into the system should either stick or fall off the map, i.e., mass is conserved. Let's try this:


In [235]:
def count_filled(grid):
    '''How many cells are in the aggregate?'''
    result = 0
    for x in range(grid.width):
        for y in range(grid.height):
            if filled(grid, x, y):
                result += 1
    return result

def test_initial_has_count_1():
    assert count_filled(create_grid(3)) == 1

def test_all_filled():
    grid = create_grid(3)
    for x in range(grid.width):
        for y in range(grid.height):
            grid[x,y] = make_color(False)
    assert count_filled(grid) == 9

ears.run()


.............
13 pass, 0 fail, 0 error

And then let's keep track of how many cells are actually filled, and how many should be:


In [237]:
def main(size, seed):
    '''Simulate diffusion-limited aggregation using a particular sequence of random values.'''
    random.seed(seed)
    grid = create_grid(size)
    stuck = 1 # the initial seed particle is stuck
    filling = True
    while filling:
        stuck += fill_next(grid) # add one to the count if this particle stuck
        if finished(grid):
            filling = False
    actual = count_filled(grid) # how many cells are actually filled?
    assert stuck == actual, '{0} should be stuck but {1} are'.format(stuck, actual)
    grid.show()

def fill_next(grid):
    x = random.randint(0, grid.width-1)
    y = grid.height - 1
    while True:
        assert not filled(grid, x, y), 'Cell ({0},{1}) is already filled'.format(x, y)
        if in_contact(grid, x, y):
            grid[x, y] = make_color(False)
            return 1 # one more cell filled
        x, y = move(x, y)
        if not on_grid(grid, x, y):
            return 0 # no cell was filled

Now let's try running a few simulations:


In [239]:
for i in range(10):
    print 'seed:', 1000 + i
    main(9, 1000+i)


seed: 1000
seed: 1001
seed: 1002
seed: 1003
seed: 1004
seed: 1005
seed: 1006
seed: 1007
seed: 1008
seed: 1009

This won't catch everything—in particular, if we ever stop a particle too early (i.e., when it's not connected to the aggregate), the conservation of mass check will still pass—but it's definitely a step forward. Let's end the lesson with one last fractal:


In [241]:
main(51, 12345)


Key Points

FIXME