Based on the Multimedia Programming lesson at Software Carpentry.
Instructions: Create a new directory called CountingStars
with a new notebook called CountingStarsTour
. Give it a heading 1 cell title Counting Stars Tour. Read this page, typing in the code in the code cells and executing them as you go.
Do not copy/paste.
Type the commands yourself to get the practice doing it. This will also slow you down so you can think about the commands and what they are doing as you type them.</font>
Save your notebook when you are done, then try the accompanying exercises.
In [ ]:
%pylab inline
import numpy as np
import matplotlib.pyplot as plt
Reading in and manipulating image data can be accomplished with NumPy or another built-in image library designed for image processing, such as the "Python Imaging Library" or PIL.
For example, we might want to read in a beautiful image of a star field taken by the Hubble Space Telescope. Have a look at some of these images:
Recall that color images are just arrays of numbers representing pixel locations and the color depth of each of Red, Green, and Blue. The different file formats (jpg
, png
, tiff
, etc.) also contain additional information (headers, for example) that an image reading function needs to know about. We can use PIL to read image data into NumPy arrays. If we want to fetch a file off the web, we also need some help from the requests
and StringIO
libraries:
In [ ]:
from PIL import Image
import requests
from StringIO import StringIO
In [ ]:
#Pick an image from the list above and fetch it with requests.get.
#It's okay to do a copy/paste here.
#The default picture here is of M45 - the Pleiades Star Cluster.
response = requests.get("http://imgsrc.hubblesite.org/hu/db/images/hs-2004-20-a-large_web.jpg")
pic = Image.open(StringIO(response.content))
Plot the image using matplotlib
:
In [ ]:
plt.imshow(pic);
Examine image properties
In [ ]:
pic.format
Pixel coordinates are (x,y) tuples with (0,0) the upper left corner because that's how old CRT monitors drew things.
In [ ]:
pic.size
Colors are represented by RGB triples. Black is (0,0,0), White is (255, 255, 255) or (0xFF, 0xFF, 0xFF) in hexadecimal. Think of it as a color cube with the three axes representing the different possible colors. The furthest away from the origin (black) is white.
In [ ]:
#Color tuple (R,G,B) of very first pixel
pic.getpixel((0,0))
In [ ]:
xsize, ysize = pic.size
max_val = 0
for x in range(xsize):
for y in range(ysize):
r,g,b = pic.getpixel((x,y))
if r+g+b > max_val:
bx, by, max_val = x, y, r+g+b
print (bx,by), max_val
By comparison, the greatest possible value is 3 * 255, or 765.
Encapsulate this code into a function that could be used on any picture:
In [ ]:
def brightest(picture):
"""add up each pixel's color values
to approximate its overall luminance"""
xsize, ysize = picture.size
bx, by, max_val = 0, 0, 0
for x in range(xsize):
for y in range(ysize):
r,g,b = pic.getpixel((x,y))
if r+g+b > max_val:
bx, by, max_val = x, y, r+g+b
return (bx, by), max_val
In [ ]:
from time import time
def elapsed(func, picture):
"""takes a function and a picture as arguments,
applies the function to the picture, and returns
the elapsed time along with whatever the function
itself returned."""
start = time()
result = func(picture)
return time() - start, result
Now run it with your picture data:
In [ ]:
print elapsed(brightest,pic)
We could process the information faster if we use the PIL getdata
function to unpack the row-and-column representation of the image to create a vector of pixels, and then loops over that.
In [ ]:
def faster(picture):
"""This function, 'faster', uses
'picture.getdata' to unpack the
row-and-column representation of the
image to create a vector of pixels,
and then loops over that."""
max_val = 0
for (r,g,b) in picture.getdata():
if r+g+b > max_val:
max_val = r + g + b
return max_val
In [ ]:
print elapsed(faster,pic)
This is faster because the pixels are unpacked into a 1-D array row-by-row. This function is more than nine times faster than its predecessor, partly because we are not translating between (x,y) coordinates and pixel locations in memory over and over again, and partly because the 'getdata
' method unpacks the pixels to make them more accessible.
But now we don't have the coordinates of the brightest pixel. We could calculate it from the location in the 1-D array but that is a bit of a pain. A useful alternative is to call 'picture.load
', which unpacks the picture's pixels in memory, so that you can index the picture as if it was an array.
In [ ]:
def in_between(picture):
xsize, ysize = picture.size
temp = picture.load()
bx, by, max_val = 0, 0, 0
for x in range(xsize):
for y in range(ysize):
r,g,b = temp[x,y]
if r+g+b > max_val:
bx, by, max_val = x,y,r+g+b
return (bx, by), max_val
In [ ]:
print elapsed(in_between,pic)
Note: If the picture can be read in as a NumPy array, you can use masks and aggregates to make this even faster. There is overhead in creating the NumPy array from the image but once it is an array, the operations are lightning fast because they don't require loops.
In [ ]:
#create numpy array of image data
myimg = np.array(pic.getdata(), np.uint8).reshape(pic.size[1], pic.size[0], 3)
#find max pixel with aggregates
def with_numpy(picture):
return picture.sum(axis=2).max()
print elapsed(with_numpy,myimg)
Which of the forms you should use in a particular situation depends on what information you need from the image, what format they are, and how big the images you're working with are.
Let's use what we've learned to count stars in our image. Start by converting the image to B/W, so that which pixels belong to stars and which don't is unambiguous. We'll use black for stars and white for background, since it's easier to see black-on-white than the reverse.
In [ ]:
def monochrome(picture, threshold):
"""loops over the pixels in the loaded image,
replacing the RGB values of each with either
black or white depending on whether its total
luminance is above or below some threshold
passed in by the user"""
black = (0, 0, 0)
white = (255, 255, 255)
xsize, ysize = picture.size
temp = picture.load()
for x in range(xsize):
for y in range(ysize):
r,g,b = temp[x,y]
if r+g+b >= threshold:
temp[x,y] = black
else:
temp[x,y] = white
Could you do this faster with masks?
In [ ]:
#Get another copy to convert to B/W
bwpic = Image.open(StringIO(response.content))
#Remember, this threshold is a scalar, not an RGB triple
#we're looking for pixels whose total color value is 600 or greater
monochrome(bwpic,200+200+200)
plt.imshow(bwpic);
Now we can start counting stars by counting "blobs" of connected or adjacent black pixels.
Decide what we mean by "adjacent": sharing sides of square pixels but not corners. i.e. directly above, below, left, or right from each other, not if they are touching diagonally.
Count each one just once.
How do we tell whether the pixel we're looking at is part of a new blob or not?
We could mark counted pixels by turning black ones red. (Hmm...sounds a bit like ipythonblocks! In fact, ipythonblocks can also read in images and manipulate them as ImageGrid objects!)
If the pixel we're looking at touches one that has already been turned red, then it's part of a blob we have already counted. We'll turn it red to show that we have looked at it, but we won't count it as a star, since it belongs to a star we've already counted.
Let's see how to implement such an algorithm.
First, implement a function to do our counting:
In [ ]:
BLACK = (0,0,0)
RED = (255,0,0)
def count(picture):
"""scan the image top to bottom and left to right using a nested loop.
when black pixel is found, increment the count, then call the fill
function to fill in all the pixels connected to that one."""
xsize, ysize = picture.size
temp = picture.load()
result = 0
for x in range(xsize):
for y in range(ysize):
if temp[x,y] == BLACK:
result += 1
fill(temp,xsize,ysize,x,y)
return result
Uh... we don't have a "fill
" function yet. What would it look like? Maybe we could set it up as follows:
Keep list of (x,y) coordinates to be examined (the "queue").
Loop until queue is empty:
Here's what it might look like:
In [ ]:
def fill(picture, xsize, ysize, xstart, ystart):
"""keep a list of pixels that need to be looked at,
but haven't yet been filled in - a list of the (x,y)
coordinates of pixels that are neighbors of ones we have
already examined. Keep looping until there's nothing
left in this list"""
queue = [(xstart,ystart)]
#print "queue start:",queue
qcount = 0
while queue:
#print qcount,": ",queue
x,y,queue = queue[0][0], queue[0][1], queue[1:]
if picture[x,y] == BLACK:
picture[x,y] = RED
if x > 0:
queue.append((x-1,y))
if x < (xsize-1):
queue.append((x+1,y))
if y > 0:
queue.append((x, y-1))
if y < (ysize-1):
queue.append((x, y+1))
qcount+=1
In [ ]:
count(bwpic)
Here's the 'fill
' algorithm in action.
Another alternative, which isn't as intuitive to beginning programmers, is to use a recursive algorithm. This keeps the work to be done on the call stack, instead of maintaining an explicit work queue.
Have a look at fill_rec
. It starts by checking the pixel it has been asked to look at. If that pixel is not black, fill_rec
returns immediately: there is nothing for it to do. If the pixel is black, fill_rec
turns it red and then calls fill_rec
on its left-hand neighbor (if it has one). It then goes on to call fill_rec
for the right-hand, top, and bottom neighbors as well. This has the same effect as using an explicit work queue; each call to fill_rec
takes care of one pixel, then calls fill_rec
again to take care of the neighbors.
In [ ]:
def fill_rec(picture, xsize, ysize, x, y):
"""each call to 'fillrec' takes care of one pixel,
then calls 'fillrec' again to take care of the neighbors"""
if picture[x,y] != BLACK:
return
picture[x,y] = RED
if x > 0:
fill_rec(picture, xsize, ysize, x-1, y)
if x < (xsize-1):
fill_rec(picture, xsize, ysize, x+1, y)
if y > 0:
fill_rec(picture, xsize, ysize, x, y-1)
if y < (ysize-1):
fill_rec(picture, xsize, ysize, x, y+1)
The logic can feel a bit circular but recursive algorithms are very powerful.