Basic Programming Using Python: Repeating Things

Objectives

  • Explain what for loops are used for.
  • Correctly identify loop variables and explain when and how they are updated.
  • Trace the execution of simple and doubly-nested for loops.
  • Use the ipythonblocks library to create two-dimensional grids of colored cells.
  • Select elements and sections of those grids using simple and compound indices.

For Loops

Computers are useful because they can do lots of calculations on lots of data. To take advantage of this, we need a way to express many calculations with just a few statements. Let's start by printing the vowels in order the hard way:


In [2]:
print 'a'
print 'e'
print 'i'
print 'o'
print 'u'


a
e
i
o
u

Now let's do it the easy way:


In [3]:
for vowel in 'aeiou':
    print vowel


a
e
i
o
u

The keywords for and in are used to create a for loop, which tells the computer to execute one or more statements for each thing in some group. The indented line is called the body of the loop: it's what Python executes repeatedly. The variable vowel is the loop variable: each time the body is executed, it is assigned the next value from the string 'aeiou'. There's nothing magical about the loop variable's name: we could call it x or fish, but as with all variables, using something meaningful makes our programs easier to understand.
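As a small sketch of that last point, the loop below uses the deliberately unhelpful name fish and still visits the same characters in the same order. The variable collected is an assumption added for illustration, and print is called with a single argument so that the sketch runs under both Python 2 and Python 3.

```python
# Build up a string from the values the loop variable takes on.
collected = ''
for fish in 'aeiou':
    # Each time around, fish holds the next character of the string.
    collected = collected + fish

print(collected)   # aeiou
```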

Here's another loop that repeatedly updates a variable:


In [4]:
length = 0
for vowel in 'aeiou':
    length = length + 1
print 'There are', length, 'vowels'


There are 5 vowels

It's worth tracing the execution of this little program step by step. Since there are five characters in 'aeiou', the statement on line 3 will be executed five times. The first time around, length is zero (the value assigned to it on line 1) and vowel is 'a'. The statement adds 1 to the old value of length, producing 1, and updates length to refer to that new value. The next time around, vowel is 'e' and length is 1, so length is updated to be 2. After three more updates, length is 5; since there is nothing left in 'aeiou' for Python to process, the loop finishes and the print statement on line 4 tells us our final answer.
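One way to check a trace like this is to have the program report each step itself. The sketch below is the same loop with an extra variable, old_length, introduced here only for illustration; print is called with a single formatted string so that it behaves the same in Python 2 and Python 3.

```python
length = 0
for vowel in 'aeiou':
    old_length = length      # the value before this pass through the body
    length = length + 1      # the statement being traced
    print('vowel is %s: length goes from %d to %d' % (vowel, old_length, length))

print('final length: %d' % length)   # final length: 5
```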


Finding the Length

Counting the number of characters in a string is such a common operation that Python has a built-in function to do it called `len`:


In [7]:
print 'There are', len('aeiou'), 'vowels'


There are 5 vowels

`len` is much faster than any function we could write ourselves, and much easier to read than a two-line loop; it will also give us the length of many other things that we haven't met yet, so we should always use it when we can.
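As a rough illustration, the hypothetical helper count_items below counts items the slow way with a loop, so we can compare it with `len`. The tuple is included only as one example of the "other things" `len` can measure.

```python
def count_items(things):
    """Hypothetical helper: count items one at a time with a loop."""
    count = 0
    for item in things:
        count = count + 1
    return count

orange = (255, 128, 0)   # a tuple with three components

print(count_items('aeiou') == len('aeiou'))   # True
print(len(orange))                            # 3
```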


Grids, Colors, and Help

Now that we can write loops, we need some more things to process with them. We could use images, but since it's hard to see individual pixels, we will use a library called ipythonblocks instead. Here's a simple example of it in action:


In [8]:
from ipythonblocks import ImageGrid
grid = ImageGrid(6, 3)
grid.show()


The first line of this program loads ImageGrid from the ipythonblocks library. The second line creates a 6×3 grid, and the third line asks that grid to display itself.

Like real images, image grids have properties:


In [9]:
print 'width:', grid.width
print 'height:', grid.height
print 'lines_on:', grid.lines_on


width: 6
height: 3
lines_on: True

We can change some of the grid's properties:


In [10]:
grid.lines_on = False
grid.show()


but not others:


In [11]:
grid.width = 100


---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-11-4f0691382f8e> in <module>()
----> 1 grid.width = 100

AttributeError: can't set attribute

In this case, the library won't let us change the grid's size because it would have to guess what color to use if we added more rows and columns. But ImageGrid can do many more things; to find out what, we can ask for help:


In [12]:
help(grid)


Help on ImageGrid in module ipythonblocks.ipythonblocks object:

class ImageGrid(BlockGrid)
 |  A grid of blocks whose colors can be individually controlled.
 |  
 |  Parameters
 |  ----------
 |  width : int
 |      Number of blocks wide to make the grid.
 |  height : int
 |      Number of blocks high to make the grid.
 |  fill : tuple of int, optional
 |      An optional initial color for the grid, defaults to black.
 |      Specified as a tuple of (red, green, blue). E.g.: (10, 234, 198)
 |  block_size : int, optional
 |      Length of the sides of grid blocks in pixels. One is the lower limit.
 |  lines_on : bool, optional
 |      Whether or not to display lines between blocks.
 |  origin : {'lower-left', 'upper-left'}
 |      Set the location of the grid origin.
 |  
 |  Attributes
 |  ----------
 |  width : int
 |      Number of blocks along the width of the grid.
 |  height : int
 |      Number of blocks along the height of the grid.
 |  shape : tuple of int
 |      A tuple of (width, height).
 |  block_size : int
 |      Length of the sides of grid blocks in pixels.
 |  lines_on : bool
 |      Whether lines are shown between blocks when the grid is displayed.
 |      This attribute can used to toggle the whether the lines appear.
 |  origin : str
 |      The location of the grid origin.
 |  
 |  Method resolution order:
 |      ImageGrid
 |      BlockGrid
 |      __builtin__.object
 |  
 |  Methods defined here:
 |  
 |  __getitem__(self, index)
 |  
 |  __init__(self, width, height, fill=(0, 0, 0), block_size=20, lines_on=True, origin='lower-left')
 |  
 |  __iter__(self)
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  block_size
 |  
 |  origin
 |  
 |  ----------------------------------------------------------------------
 |  Methods inherited from BlockGrid:
 |  
 |  __setitem__(self, index, value)
 |  
 |  __str__(self)
 |  
 |  animate(self, stop_time=0.2)
 |      Call this method in a loop definition to have your changes to the grid
 |      animated in the IPython Notebook.
 |      
 |      Parameters
 |      ----------
 |      stop_time : float
 |          Amount of time to pause between loop steps.
 |  
 |  copy(self)
 |      Returns an independent copy of this BlockGrid.
 |  
 |  flash(self, display_time=0.2)
 |      Display the grid for a time.
 |      
 |      Useful for making an animation or iteratively displaying changes.
 |      
 |      Parameters
 |      ----------
 |      display_time : float
 |          Amount of time, in seconds, to display the grid.
 |  
 |  show(self)
 |      Display colored grid as an HTML table.
 |  
 |  to_text(self, filename=None)
 |      Write a text file containing the size and block color information
 |      for this grid.
 |      
 |      If no file name is given the text is sent to stdout.
 |      
 |      Parameters
 |      ----------
 |      filename : str, optional
 |          File into which data will be written. Will be overwritten if
 |          it already exists.
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from BlockGrid:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)
 |  
 |  height
 |  
 |  lines_on
 |  
 |  shape
 |  
 |  width

We don't know enough yet to use all of these right now—particularly not the ones with double underscores in their names—but we'll explore most of them in coming lessons. Before we do that, though, we need to make our grids more exciting by coloring in some of the cells, and to do that, we need a way to specify colors.

The most common color scheme is called RGB, which defines colors according to how much red, green, and blue they contain. This is an additive color model: the color we see is the sum of the individual color values, each of which can range between 0 and 255.


Bytes and Colors

Color values go up to 255 because computer memory is organized into 8-bit bytes, and 255 (or 11111111 in base 2) is the largest integer that can be represented in one byte. Storing each color component in a single byte is a good balance between being able to represent enough colors to fool the human eye and being able to store color values efficiently.
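The arithmetic behind these numbers can be checked directly. This is only a sketch; the variable names are chosen here for illustration.

```python
# Interpret the text '11111111' as a base-2 number.
largest_byte = int('11111111', 2)
print(largest_byte)        # 255

# Eight bits give 2**8 distinct values, numbered 0 through 255.
values_per_byte = 2 ** 8
print(values_per_byte)     # 256

# Three one-byte components give this many distinct RGB colors.
total_colors = values_per_byte ** 3
print(total_colors)        # 16777216
```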

In RGB, black is (0, 0, 0), i.e., nothing of any color, and white is the maximum value of all three colors, or (255, 255, 255). We can think of the other colors as being arranged in a cube: the three axes represent the primary colors, while secondary colors are combinations of maximum values, and each actual color is a coordinate in this cube.
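To make the additive model concrete, here is a minimal sketch that adds colors component by component. The helper add_colors is hypothetical (it is not part of any library used in this lesson), and capping each sum at 255 is an assumption made so that results stay in the legal range.

```python
def add_colors(first, second):
    """Hypothetical helper: add two RGB tuples component by component,
    capping each component at 255."""
    return (min(first[0] + second[0], 255),
            min(first[1] + second[1], 255),
            min(first[2] + second[2], 255))

red = (255, 0, 0)
green = (0, 255, 0)
blue = (0, 0, 255)

# Secondary colors are pairs of primaries at full strength.
yellow = add_colors(red, green)
cyan = add_colors(green, blue)
magenta = add_colors(red, blue)
print(yellow)    # (255, 255, 0)
print(cyan)      # (0, 255, 255)
print(magenta)   # (255, 0, 255)

# All three primaries together give white.
white = add_colors(yellow, blue)
print(white)     # (255, 255, 255)
```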


The ipythonblocks library includes a function called show_color that we can use to explore RGB colors:


In [13]:
from ipythonblocks import show_color
show_color(255, 0, 0) # all red



In [14]:
show_color(0, 255, 0) # all green
show_color(0, 0, 255) # all blue



In [15]:
show_color(50, 100, 200) # a soothing shade of blue-gray


We're finally ready to change the color of one cell of our grid:


In [16]:
grid.lines_on = True
grid[0, 0] = (0, 255, 0)
grid.show()


There's a lot going on in that assignment statement:

  1. A color is one value with three components, so we write it as a tuple, just as we wrote the (x,y) size of an image in the previous lessons.
  2. We use an index (sometimes called a subscript) to refer to a particular cell in a grid, just as we do when referring to a particular element of a matrix in mathematics such as $A_{i,j}$. We write the index in square brackets rather than as an actual subscript because text editors didn't support fancy typesetting back in the 1950s.
  3. The corner's coordinates are (0, 0), not (1, 1). Programming languages derived from C (a family that includes Python, Perl, and Java) all count from 0 for the same reason that color values run from 0 to 255 instead of 1 to 256. Some other languages (notably Fortran, MATLAB, and R) count from 1. The latter is more natural (nobody except a computer scientist says, "Zero, one, two, three, four," when counting their fingers), but zero-based counting does have a few small advantages, and even if it didn't, we're stuck with it.
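The same ideas can be tried on values we have already met, without a grid. The sketch below is illustrative only: it uses square-bracket indices on a color tuple and on a string, and shows that the last legal index is one less than the length.

```python
green = (0, 255, 0)     # one value with three components
print(green[0])         # 0, the red component
print(green[1])         # 255, the green component

vowels = 'aeiou'
first = vowels[0]                  # counting starts at 0
last = vowels[len(vowels) - 1]     # so the last index is length - 1
print(first)            # a
print(last)             # u
```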


Block Grids

The `ipythonblocks` library has another class called `BlockGrid` that has all the same capabilities as `ImageGrid`, but which uses the same coordinate scheme as tables and spreadsheets: the first index counts down the rows, while the second counts across the columns. We'll use `ImageGrid` in all of our examples to be consistent with the way `skimage.novice` refers to pixels.
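As a rough sketch of how the two schemes relate, assume an `ImageGrid` with its default lower-left origin and a `BlockGrid` whose rows are counted from the top. The helper to_row_column is hypothetical and is not part of `ipythonblocks`; it only shows the arithmetic.

```python
def to_row_column(x, y, height):
    """Hypothetical helper: convert an (x, y) index measured from the
    lower-left corner into a (row, column) index measured from the top-left."""
    row = height - 1 - y    # y counts up from the bottom, rows count down from the top
    column = x              # both schemes count columns from the left
    return (row, column)

height = 3
print(to_row_column(0, 0, height))   # (2, 0): bottom-left cell is in the last row
print(to_row_column(0, 2, height))   # (0, 0): top-left cell is in the first row
print(to_row_column(5, 1, height))   # (1, 5)
```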


Looping Over Grids

Suppose we want to turn all the pixels in the first column of a grid orange. Once again, we'll do it the wrong way first:


In [17]:
rectangle = ImageGrid(6, 3)
rectangle[0, 0] = (255, 128, 0)
rectangle[0, 1] = (255, 128, 0)
rectangle[0, 2] = (255, 128, 0)
rectangle.show()


Why is this wrong?

  1. We are changing the color of each cell manually. If we had a 100×200 grid, that would mean writing 200 lines of code.
  2. We are writing out our color three times. If we want to change the shade of orange we're using, we'll have to modify each of those lines.
  3. We must be careful to write exactly as many assignment statements as there are rows in the grid. If we ever change the grid size, we'll have to add or remove assignment statements.

Let's fix the second problem first:


In [18]:
rectangle = ImageGrid(6, 3)
orange = (255, 128, 0)
rectangle[0, 0] = orange
rectangle[0, 1] = orange
rectangle[0, 2] = orange
rectangle.show()


This version is one line longer, but it's much more readable: putting the color value in a variable, then referring to that variable everywhere we want the particular shade of orange, makes it clear that we're using exactly the same color for each cell.

Here's our first attempt at writing one assignment statement in a loop instead of multiple assignment statements:


In [19]:
rectangle = ImageGrid(6, 3)
orange = (255, 128, 0)
for y in ...something...:
    rectangle[0, y] = orange
rectangle.show()


  File "<ipython-input-19-49fc77ace1ad>", line 3
    for y in ...something...:
             ^
SyntaxError: invalid syntax

Our plan is to use a loop to set a variable y to the values 0, 1, and 2, and then to assign orange to rectangle[0, y] for each of those values. But what should we use in place of ...something...? By analogy with our loop over a character string, we want a single collection that contains the numbers 0, 1, and 2. Fortunately, Python has a built-in function called range that generates exactly that sequence:


In [20]:
for number in range(3):
    print number


0
1
2

Note the relationship between the parameter to range and the sequence of numbers it creates: if we ask for range(N), we get 0, 1, 2, …, N-1, which are exactly the legal indices for something with N elements:


In [21]:
vowels = 'aeiou'
length = len(vowels)
for i in range(length):
    print i, vowels[i]


0 a
1 e
2 i
3 o
4 u

Let's use range to make our loop work:


In [22]:
rectangle = ImageGrid(6, 3)
orange = (255, 128, 0)
for y in range(3):
    rectangle[0, y] = orange
rectangle.show()


This is real progress: as the example below shows, the same number of statements will work for a larger grid:


In [23]:
larger = ImageGrid(20, 8)
purple = (200, 0, 200)
for y in range(8):
    larger[0, y] = purple
larger.show()


And we can now fix the third of our problems, namely, the fact that we have to change the parameter to range each time we change the size of the grid. Since the grid knows how big it is, we can use its size as that parameter so that it will always be exactly the right value:


In [24]:
final = ImageGrid(15, 6)
teal = (0, 180, 180)
for y in range(final.height):
    final[0, y] = teal
final.show()



DRY

These improvements in our program are based on a design principle called [DRY](glossary.html#dry): Don't Repeat Yourself. Every fact in a program should be written down exactly once, and every part of the program that needs that fact should refer to that definitive value. Doing this helps ensure that the program is always internally consistent (i.e., if it's wrong, it's wrong in the same way everywhere).


In fact, we can now put what we've done in a function that will set the color of any column we want:


In [25]:
def color_column(grid, x, color):
    for y in range(grid.height):
        grid[x, y] = color

first = ImageGrid(10, 4)
color_column(first, 2, (200, 100, 50))
first.show()


And then:


In [26]:
second = ImageGrid(20, 3)
brown = (200, 100, 50)
color_column(second, 5, brown)
color_column(second, 10, brown)
color_column(second, 15, brown)
second.show()
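The three calls above differ only in the column index, so DRY suggests looping over those indices as well. The sketch below assumes a hypothetical stand-in class, TextGrid, that stores colors without drawing anything, so that it runs where ipythonblocks is not installed; with the real library, ImageGrid would take its place.

```python
class TextGrid(object):
    """Hypothetical stand-in for ImageGrid: remembers one color per cell."""

    def __init__(self, width, height, fill=(0, 0, 0)):
        self.width = width
        self.height = height
        self.cells = {}
        for x in range(width):
            for y in range(height):
                self.cells[(x, y)] = fill

    def __getitem__(self, index):
        return self.cells[index]

    def __setitem__(self, index, color):
        self.cells[index] = color


def color_column(grid, x, color):
    for y in range(grid.height):
        grid[x, y] = color


second = TextGrid(20, 3)
brown = (200, 100, 50)
for x in (5, 10, 15):        # the column indices, written down once
    color_column(second, x, brown)

print(second[5, 0])    # (200, 100, 50)
print(second[6, 0])    # (0, 0, 0), untouched
```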



When Short Names Are OK

We said earlier that programs should use meaningful variable names. We are violating that rule by using `x` and `y` as variables in this function, but it's a defensible violation. Suppose we re-write our loop as:


In [27]:
def color_column(grid, x_axis_index, color):
    for y_axis_index in range(grid.height):
        grid[x_axis_index, y_axis_index] = color

The longer names are more meaningful, but they also take longer to read. Since these variables are only used for a few lines, readers will easily be able to keep their meaning in short-term memory as long as they need to. On balance, therefore, the short names are better in this case.

In general, a variable that holds a simple value and is only used in a few adjacent lines of code can (and usually should) have a short name. A variable that holds a complex value, or one which is used over more than a few lines of code, should have a longer name in order to optimize the tradeoff between reading speed and the limitations of human short-term memory.


Nested Loops

What if we want to color in a rectangle of cells rather than a single column? For example, suppose we want to turn the nine pixels in the lower left corner blue. We could do this:

grid[0, 0] = (0, 0, 255)
grid[0, 1] = (0, 0, 255)
grid[0, 2] = (0, 0, 255)
grid[1, 0] = (0, 0, 255)
grid[1, 1] = (0, 0, 255)
grid[1, 2] = (0, 0, 255)
grid[2, 0] = (0, 0, 255)
grid[2, 1] = (0, 0, 255)
grid[2, 2] = (0, 0, 255)

but we already know that's a bad approach. We'd be better off using a loop for each column to cut our program from nine lines to six:


In [28]:
grid = ImageGrid(8, 5)
for y in range(3):
    grid[0, y] = (0, 0, 255)
for y in range(3):
    grid[1, y] = (0, 0, 255)
for y in range(3):
    grid[2, y] = (0, 0, 255)
grid.show()


But look more closely: the only difference between the three loops is the column index (the first index) used in each, and those indices go in sequence 0, 1, and 2. That means we can use another loop to produce them:


In [29]:
for x in range(3):
    for y in range(3):
        grid[x, y] = (0, 0, 255)
grid.show()


This is called a nested loop. Each time the outer loop runs once, the inner loop runs three times, so that every possible combination of row and column index is seen exactly once.
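To trace the nested loop without drawing anything, we can print the indices instead of coloring cells. This is only a sketch; the counter named step is added here for illustration.

```python
step = 0
for x in range(3):
    for y in range(3):
        step = step + 1
        # y changes fastest: it runs through 0, 1, 2 for each value of x.
        print('step %d: x = %d, y = %d' % (step, x, y))

print('combinations seen: %d' % step)   # combinations seen: 9
```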

In this case we get the same result if we turn the loops inside out and run over each row once for each column, instead of each column once for each row—the only difference is the order in which blocks are colored in. There are cases where the order of the loops matters. For example, the range of the inner loop in the program below depends on the current value of the outer loop's index:


In [30]:
square = ImageGrid(5, 5)
for x in range(square.width):
    for y in range(x): # Note: y values depend on current value of x
        square[x, y] = (0, 128, 255)
square.show()


The diagram below shows the order in which these cells are colored in:
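The same order can also be listed in text. The sketch below assumes the 5×5 grid from the example and prints cell coordinates rather than coloring them. When x is 0, range(0) is empty, so the inner loop does not run at all for the first column.

```python
width = 5    # same as square.width in the example above
step = 0
for x in range(width):
    for y in range(x):       # runs x times: 0, 1, 2, 3, then 4 times
        step = step + 1
        print('step %d: cell (%d, %d)' % (step, x, y))

print('cells colored: %d' % step)   # cells colored: 10
```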

If we invert the loops—i.e., if we loop over all values of y but make the range of x depend on the value of y—then we get the following pattern instead:


In [31]:
for y in range(square.height):
    for x in range(y): # This time x depends on y.
        square[x, y] = (255, 128, 0)
square.show()


And if we want, we can use a single index for both our x and y values to color in the diagonal:


In [32]:
for i in range(square.height):
    square[i, i] = (100, 200, 100)
square.show()
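Taken together, the last three loops cover every cell of the square exactly once. The sketch below draws the result as text, assuming the default lower-left origin, so the row with the largest y is printed first. The letters are a convention chosen here: b for the first (blue) triangle, o for the second (orange) triangle, and g for the green diagonal.

```python
size = 5
below = 0       # cells with y < x, colored by the first triangle loop
above = 0       # cells with x < y, colored by the inverted loop
diagonal = 0    # cells with x == y

for row in range(size):
    y = size - 1 - row       # print the top row first
    line = ''
    for x in range(size):
        if y < x:
            line = line + 'b'
            below = below + 1
        elif x < y:
            line = line + 'o'
            above = above + 1
        else:
            line = line + 'g'
            diagonal = diagonal + 1
    print(line)

print('%d + %d + %d = %d' % (below, above, diagonal, size * size))
```

The picture it prints runs from `oooog` at the top to `gbbbb` at the bottom, and the three counts add up to the 25 cells of the grid.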


Standard Colors

Let's make one more small change to our program:


In [33]:
from ipythonblocks import colors

grid = ImageGrid(7, 4, fill=colors['Green'])
for y in range(grid.height):
    for x in range(y):
        grid[x, y] = colors['Aqua']
grid.show()


ipythonblocks.colors is a collection of 140 named colors. These are just the three-part tuples we've been using:


In [34]:
print 'Orange:', colors['Orange']
print 'Yellow:', colors['Yellow']


Orange: (255, 165, 0)
Yellow: (255, 255, 0)

but using colors from the library saves us having to experiment with RGB values, and helps keep our programs consistent with other people's. We can use these colors, or our own, both to set the colors of individual cells, and to specify an initial color for all the cells in a grid using the optional fill parameter.

Key Points

  • for variable in collection repeats one or more statements once for each item in a collection.
    • The loop variable is assigned the values from the collection one at a time.
  • grid[x, y] refers to a single cell in a two-dimensional grid.
  • A tuple (red, green, blue) specifies a color.
    • Each component must be in the range 0…255.
  • Every fact in a program should be expressed exactly once.
    • Every other part of the program should refer to it rather than duplicating it.
  • The function call range(N) produces the values 0, 1, 2, …, N-1.
  • Use nested loops to process a two-dimensional range of values.
    • The range of the inner loop can depend on the current value of the outer loop variable.