Based on the Multimedia Programming lesson at Software Carpentry.
In [ ]:
    
%pylab inline
import numpy as np
import matplotlib.pyplot as plt
    
Reading in and manipulating image data can be accomplished with NumPy or another built-in image library designed for image processing, such as the "Python Imaging Library" or PIL.
For example, we might want to read in a beautiful image of a star field taken by the Hubble Space Telescope. Have a look at some of these images:
Recall that color images are just arrays of numbers representing pixel locations and the color depth of each of Red, Green, and Blue.  The different file formats (jpg, png, tiff, etc.) also contain additional information (headers, for example) that an image reading function needs to know about.  We can use PIL to read image data into NumPy arrays.  If we want to fetch a file off the web, we also need some help from the requests and StringIO libraries:
In [ ]:
    
from PIL import Image
import requests
from StringIO import StringIO
    
In [ ]:
    
#Pick an image from the list above and fetch it with requests.get.  
#It's okay to do a copy/paste here.
#The default picture here is of M45 - the Pleiades Star Cluster.
response = requests.get("http://imgsrc.hubblesite.org/hu/db/images/hs-2004-20-a-large_web.jpg")
pic = Image.open(StringIO(response.content))
    
Plot the image using matplotlib:
In [ ]:
    
plt.imshow(pic);
    
Examine image properties
In [ ]:
    
pic.format
    
Pixel coordinates are (x,y) tuples with (0,0) the upper left corner because that's how old CRT monitors drew things.
In [ ]:
    
pic.size
    
Colors are represented by RGB triples. Black is (0,0,0), White is (255, 255, 255) or (0xFF, 0xFF, 0xFF) in hexadecimal. Think of it as a color cube with the three axes representing the different possible colors. The furthest away from the origin (black) is white.
In [ ]:
    
#Color tuple (R,G,B) of very first pixel
pic.getpixel((0,0))
    
In [ ]:
    
xsize, ysize = pic.size
max_val = 0
for x in range(xsize):
    for y in range(ysize):
        r,g,b = pic.getpixel((x,y))
        if r+g+b > max_val:
            bx, by, max_val = x, y, r+g+b
print (bx,by), max_val
    
By comparison, the greatest possible value is 3 * 255, or 765.
Encapsulate this code into a function that could be used on any picture:
In [ ]:
    
def brightest(picture):
    """add up each pixel's color values 
    to approximate its overall luminance"""
    xsize, ysize = picture.size
    bx, by, max_val = 0, 0, 0
    for x in range(xsize):
        for y in range(ysize):
            r,g,b = pic.getpixel((x,y))
            if r+g+b > max_val:
                bx, by, max_val = x, y, r+g+b
    return (bx, by), max_val
    
In [ ]:
    
from time import time
def elapsed(func, picture):
    """takes a function and a picture as arguments, 
    applies the function to the picture, and returns 
    the elapsed time along with whatever the function 
    itself returned."""    
    start = time()
    result = func(picture)
    return time() - start, result
    
Now run it with your picture data:
In [ ]:
    
print elapsed(brightest,pic)
    
We could process the information faster if we use the PIL getdata function to unpack the row-and-column representation of the image to create a vector of pixels, and then loops over that.
In [ ]:
    
def faster(picture):
    """This function, 'faster', uses 
    'picture.getdata' to unpack the 
    row-and-column representation of the 
    image to create a vector of pixels, 
    and then loops over that."""
    max_val = 0
    for (r,g,b) in picture.getdata():
        if r+g+b > max_val:
            max_val = r + g + b
    return max_val
    
In [ ]:
    
print elapsed(faster,pic)
    
This is faster because the pixels are unpacked into a 1-D array row-by-row.  This function is more than nine times faster than its predecessor, partly because we are not translating between (x,y) coordinates and pixel locations in memory over and over again, and partly because the 'getdata' method unpacks the pixels to make them more accessible.
But now we don't have the coordinates of the brightest pixel.  We could calculate it from the location in the 1-D array but that is a bit of a pain.  A useful alternative is to call 'picture.load', which unpacks the picture's pixels in memory, so that you can index the picture as if it was an array.
In [ ]:
    
def in_between(picture):
    xsize, ysize = picture.size
    temp = picture.load()
    bx, by, max_val = 0, 0, 0
    for x in range(xsize):
        for y in range(ysize):
            r,g,b = temp[x,y]
            if r+g+b > max_val:
                bx, by, max_val = x,y,r+g+b
    return (bx, by), max_val
    
In [ ]:
    
print elapsed(in_between,pic)
    
Note: If the picture can be read in as a NumPy array, you can use masks and aggregates to make this even faster. There is overhead in creating the NumPy array from the image but once it is an array, the operations are lightning fast because they don't require loops.
In [ ]:
    
#create numpy array of image data
myimg = np.array(pic.getdata(), np.uint8).reshape(pic.size[1], pic.size[0], 3)
#find max pixel with aggregates
def with_numpy(picture):
  return picture.sum(axis=2).max()
print elapsed(with_numpy,myimg)
    
Which of the forms you should use in a particular situation depends on what information you need from the image, what format they are, and how big the images you're working with are.
Let's use what we've learned to count stars in our image. Start by converting the image to B/W, so that which pixels belong to stars and which don't is unambiguous. We'll use black for stars and white for background, since it's easier to see black-on-white than the reverse.
In [ ]:
    
def monochrome(picture, threshold):
    """loops over the pixels in the loaded image, 
    replacing the RGB values of each with either 
    black or white depending on whether its total 
    luminance is above or below some threshold 
    passed in by the user"""
    black = (0, 0, 0)
    white = (255, 255, 255)
    xsize, ysize = picture.size
    temp = picture.load()
    for x in range(xsize):
        for y in range(ysize):
            r,g,b = temp[x,y]
            if r+g+b >= threshold: 
                temp[x,y] = black
            else:
                temp[x,y] = white
    
Could you do this faster with masks?
In [ ]:
    
#Get another copy to convert to B/W
bwpic = Image.open(StringIO(response.content))
#Remember, this threshold is a scalar, not an RGB triple
#we're looking for pixels whose total color value is 600 or greater
monochrome(bwpic,200+200+200)
plt.imshow(bwpic);
    
Now we can start counting stars by counting "blobs" of connected or adjacent black pixels.
Decide what we mean by "adjacent": sharing sides of square pixels but not corners. i.e. directly above, below, left, or right from each other, not if they are touching diagonally.
Count each one just once.
How do we tell whether the pixel we're looking at is part of a new blob or not?
We could mark counted pixels by turning black ones red. (Hmm...sounds a bit like ipythonblocks! In fact, ipythonblocks can also read in images and manipulate them as ImageGrid objects!)
If the pixel we're looking at touches one that has already been turned red, then it's part of a blob we have already counted. We'll turn it red to show that we have looked at it, but we won't count it as a star, since it belongs to a star we've already counted.
Let's see how to implement such an algorithm.
First, implement a function to do our counting:
In [ ]:
    
BLACK = (0,0,0)
RED = (255,0,0)
def count(picture):
    """scan the image top to bottom and left to right using a nested loop.
    when black pixel is found, increment the count, then call the fill
    function to fill in all the pixels connected to that one."""
    xsize, ysize = picture.size
    temp = picture.load()
    result = 0
    for x in range(xsize):
        for y in range(ysize):
            if temp[x,y] == BLACK:
                result += 1
                fill(temp,xsize,ysize,x,y)
    return result
    
Uh... we don't have a "fill" function yet.  What would it look like?  Maybe we could set it up as follows:
Keep list of (x,y) coordinates to be examined (the "queue").
Loop until queue is empty:
Here's what it might look like:
In [ ]:
    
def fill(picture, xsize, ysize, xstart, ystart):
    """keep a list of pixels that need to be looked at, 
    but haven't yet been filled in - a list of the (x,y) 
    coordinates of pixels that are neighbors of ones we have 
    already examined.  Keep looping until there's nothing 
    left in this list"""
    queue = [(xstart,ystart)]
    #print "queue start:",queue
    qcount = 0
    while queue:
        #print qcount,": ",queue
        x,y,queue = queue[0][0], queue[0][1], queue[1:]
        if picture[x,y] == BLACK:
            picture[x,y] = RED
            if x > 0:
                queue.append((x-1,y))
            if x < (xsize-1):
                queue.append((x+1,y))
            if y > 0:
                queue.append((x, y-1))
            if y < (ysize-1):
                queue.append((x, y+1))
        qcount+=1
    
In [ ]:
    
count(bwpic)
    
Here's the 'fill' algorithm in action.
Write a variation on the count function to keep track of the pixels in a given blob and use the brightest pixel in the blob as the "location" of the star. Use NumPy's I/O functions to write out a text file containing
* the link to the image
* the threshold value used in the conversion to monochrome
* the structuring element used to count its stars
* how many stars were found
* a description of the data in the subsequent columns
Apply your function to the six images at the top of the page.
You can also review the accompanying notebook Counting Stars with NumPy to see how much of this project can be streamlined using the efficiencies provided with NumPy arrays.