This week, we're moving into image processing. In this lecture, we'll touch on some core concepts around computer vision, or the field dedicated to machine understanding of images. By the end of this lecture, you should be able to
Whenever you hear about or refer to an image analysis task, you've stepped firmly into territory occupied by computer vision: the field of research associated with understanding images and designing algorithms to do so automatically.
You can probably name numerous examples of computer vision already, but just to highlight a couple:
This is all to underscore: computer vision is an extremely active area of research and application!
From the perspective of the computer, the simplest constituent of an image is a pixel.
A pixel is a picture element.
(There are many other image formats and representations, but they tend to be variations on this theme)
In either grayscale or color, the pixels are arranged in rectangular arrays, one for each color channel (1 for grayscale, 3 for RGB).
(What could these arrays possibly be in Python?)
Let's jump in and get our hands dirty! First, let's use a relevant image:
I've stored this image in the course GitHub repository under lectures/Lecture22
( https://github.com/eds-uga/cbio4835-sp17 ) if you're interested.
Here's how to load the images in Python:
In [1]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
In [2]:
# Loads the image (just like a text file!)
img = mpimg.imread("Lecture22/image1.png")
print(type(img))
Just a regular NumPy array!
Let's see if we can visualize it.
In [3]:
plt.imshow(img)
Out[3]:
This shows the whole image, all three channels.
As evidenced by the .shape
property of the NumPy array, there are three dimensions to this image:
Each slice of the third dimension is a color channel, of which there are 3: one for red, one for green, and one for blue (hence: RGB).
We can plot them separately!
In [4]:
# First, separate out the channels.
r = img[:, :, 0]
g = img[:, :, 1]
b = img[:, :, 2]
# Now, plot each channel separately.
f = plt.figure(figsize = (12, 6))
f.add_subplot(1, 3, 1)
plt.imshow(np.array(r), cmap = "gray")
f.add_subplot(1, 3, 2)
plt.imshow(np.array(g), cmap = "gray")
f.add_subplot(1, 3, 3)
plt.imshow(np.array(b), cmap = "gray")
Out[4]:
For comparison, here's the original:
In [5]:
plt.imshow(img)
Out[5]:
Image analysis of any kind is usually done on a single channel.
(Why?)
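(One common answer: most filters and statistics are defined on a single 2D intensity array.) If you ever need to collapse RGB down to one channel yourself, a typical sketch is a weighted average of the three channels. The specific weights here are the standard luma coefficients, used purely for illustration:

```python
import numpy as np

# A hypothetical RGB image with float values in [0, 1].
rgb = np.random.random((4, 4, 3))

# Luma weights: green contributes the most to perceived brightness.
gray = 0.299 * rgb[:, :, 0] + 0.587 * rgb[:, :, 1] + 0.114 * rgb[:, :, 2]

print(gray.shape)  # (4, 4) -- a single channel
```

Since the weights sum to 1, the result stays in the same value range as the input.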
Since images are stored as NumPy arrays, all the usual NumPy functionality (in addition to the slicing we saw earlier) is available to you.
In [6]:
print(np.max(img))
print(np.min(img))
In [7]:
print(np.mean(img))
print(np.median(img))
In [8]:
print(np.median(r))
print(np.median(g))
print(np.median(b))
Recall that our img
object was loaded from a PNG image; this is the only format type that Matplotlib natively supports (more on that later).
When you read an image into Python, the library will automatically detect the file format and read it into the closest approximating Python data structure it can. However, you can always convert it manually once it's in Python.
For instance, we use a slightly different approach to instead read in our image as grayscale:
In [9]:
import scipy.ndimage as ndimg
img_gray = ndimg.imread("Lecture22/image1.png", flatten = True) # The "flatten" arg is critical
print(img_gray.shape)
Note how there are only 2 dimensions now--just a height and width.
There is no need for a 3rd dimension because there's only 1 channel: luminance, or grayscale intensity.
In [10]:
plt.imshow(img_gray, cmap = "gray")
Out[10]:
We can access individual pixels, just as you would access individual elements of a NumPy matrix (because that's all it is):
In [11]:
print(img_gray[100, 200])
In [12]:
print(img_gray[150, :])
In [13]:
print(np.max(img_gray[:, 400]))
If you so desire, you can even modify the pixel values directly, again just as you would for a regular NumPy array.
Fair warning: doing this alters the image data in memory! You may want to copy the array first...
In [14]:
for i in range(img_gray.shape[0]):
    for j in range(120, 130):
        img_gray[i, j] = 255
plt.imshow(img_gray, cmap = "gray")
Out[14]:
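Following up on the earlier warning: calling .copy() on the array gives you a scratch copy to draw on, leaving the original pixel values untouched. A small sketch using a random stand-in array (not the course image):

```python
import numpy as np

# A stand-in for a grayscale image.
original = np.random.random((50, 50)) * 255

# .copy() produces an independent array; edits to it won't touch the original.
scratch = original.copy()
scratch[:, 10:20] = 255

print(np.array_equal(original, scratch))  # False -- only the copy changed
```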
Another very useful way of obtaining information about an image is to view the histogram of pixel values.
You can do this regardless of whether it's a grayscale or RGB image, though in the latter case it's useful to plot the pixel values separated by channel.
First, let's re-import the image as grayscale and take a look at how the pixel values show up in a histogram:
In [15]:
img_gray = ndimg.imread("Lecture22/image1.png", flatten = True)
_ = plt.hist(img_gray.flatten(), bins = 25)
This tells us some very useful information--primarily, that most of the pixel values are centered around what seems like a pretty low number (20-30), so by and large the image is very dark (which we saw).
There do seem to be a few light spots on an island around 120-140, but that's it.
Let's take a look now at each channel individually.
In [16]:
fig = plt.figure(figsize = (16, 4))
plt.subplot(131)
_ = plt.hist(r.flatten(), bins = 25, range = (0, 1), color = 'r')
plt.subplot(132)
_ = plt.hist(g.flatten(), bins = 25, range = (0, 1), color = 'g')
plt.subplot(133)
_ = plt.hist(b.flatten(), bins = 25, range = (0, 1), color = 'b')
Recall what each channel represented:
There seems to be very little HSP27, while there is tons of actin and the quantity of DAPI falls somewhere in between.
...oh wait, did you see the scales for each one?
In [17]:
x = np.linspace(0, 1, 25)
plt.plot(x, np.histogram(r.flatten(), bins = 25)[0], color = 'r', label = 'Actin')
plt.plot(x, np.histogram(g.flatten(), bins = 25)[0], color = 'g', label = 'HSP27')
plt.plot(x, np.histogram(b.flatten(), bins = 25)[0], color = 'b', label = 'DAPI')
plt.legend()
Out[17]:
So, yes:
While we're on the topic of histograms, there is a convenient way to try and "reshape" the pixel histograms so as to make the resulting image a bit sharper. This is called histogram equalization.
The idea is simple enough: re-map the pixel values in the image so that the corresponding histogram is perfectly flat.
Basically it tries to fill in the "valleys" and flatten the "peaks" of the pixel histograms we saw earlier--this has the effect of bringing out very dim signal and dampening oversaturated signal.
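The remapping itself can be sketched directly with NumPy: build the histogram, take its cumulative sum (the CDF), normalize it to [0, 1], and use it as a lookup table. This is a simplified version for 8-bit grayscale data; the variable names and the stand-in image are ours, not from any library:

```python
import numpy as np

# A stand-in 8-bit grayscale image.
img = (np.random.random((64, 64)) * 255).astype(np.uint8)

# Histogram -> cumulative distribution -> remap so the CDF becomes linear.
hist, _ = np.histogram(img.flatten(), bins=256, range=(0, 256))
cdf = hist.cumsum()
cdf_norm = (cdf - cdf.min()) / (cdf.max() - cdf.min())  # scale CDF to [0, 1]
equalized = (cdf_norm[img] * 255).astype(np.uint8)      # look up each pixel's new value

print(equalized.shape)
```

Pixels in crowded parts of the histogram get spread apart; pixels in sparse parts get pushed together.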
Let's see an example, using one of the image channels.
In [18]:
from PIL import Image, ImageOps
img_pil = Image.open("Lecture22/image1.png")
beq = ImageOps.equalize(img_pil.split()[2])
f = plt.figure(figsize = (12, 6))
f.add_subplot(1, 2, 1)
plt.imshow(b, cmap = 'gray')
f.add_subplot(1, 2, 2)
plt.imshow(np.array(beq), cmap = 'gray')
Out[18]:
We can directly see why these two images look different (and, specifically, what histogram equalization did) by recomputing the channel histograms:
In [19]:
plt.plot(img_pil.split()[2].histogram(), 'b')
plt.plot(beq.histogram(), 'k')
Out[19]:
Autocontrast is another tool that modifies the pixel histograms to try and make the resulting images more viewable. In this case, the goal of autocontrast is to maximize (normalize) image contrast.
This function calculates a histogram of the input image, removes cutoff percent of the lightest and darkest pixels from the histogram, and remaps the image so that the darkest remaining pixel becomes black (0), and the lightest becomes white (255).
In essence, you choose some percentage cutoff (say 0.5, i.e. half a percent), remove that fraction of the darkest and lightest pixels (assuming they're noise and throwing them away), then remap the remaining pixels.
Here's what it might look like:
In [20]:
bcon = ImageOps.autocontrast(img_pil.split()[2], 0.5)
f = plt.figure(figsize = (12, 6))
f.add_subplot(1, 2, 1)
plt.imshow(b, cmap = "gray")
f.add_subplot(1, 2, 2)
plt.imshow(np.array(bcon), cmap = "gray")
Out[20]:
In this case, we're trying to chop off pixel values at both ends of the histogram (lightest and darkest) and reshuffling the others around to make them more visible, hopefully improving contrast.
The effects on the underlying histograms look like:
In [21]:
plt.plot(img_pil.split()[2].histogram(), 'r')
plt.plot(bcon.histogram(), 'k')
Out[21]:
It closely mimics the original histogram, but because some values at the tails were thrown away, all the other values were reshuffled--you end up with more pixels at the middle values, which is (presumably) the signal you're interested in.
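The same idea can be sketched with plain NumPy as a percentile-based contrast stretch. This mimics what autocontrast does; it is not PIL's actual implementation, and the stand-in image is hypothetical:

```python
import numpy as np

# A dull, low-contrast stand-in image (values roughly 60-180).
img = (np.random.random((64, 64)) * 120 + 60).astype(np.uint8)

# Clip 0.5% of pixels at each tail, then stretch the rest to fill 0-255.
lo, hi = np.percentile(img, (0.5, 99.5))
stretched = (np.clip((img - lo) / (hi - lo), 0, 1) * 255).astype(np.uint8)

print(img.min(), img.max(), "->", stretched.min(), stretched.max())
```

After the stretch, the darkest surviving pixels sit at 0 and the lightest at 255, so the image uses the full dynamic range.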
Thresholding is the process by which you define a pixel threshold--say, the value 100--and set every pixel below that value to 0, and every pixel above that value to 255.
In doing so, you binarize the image, as each pixel takes on only one of two possible values.
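As a quick numeric sketch of the definition (a made-up 2x3 "image", threshold 100):

```python
import numpy as np

pixels = np.array([[ 30, 180,  90],
                   [210,  60, 140]])

# Everything below the threshold goes to 0; everything at or above goes to 255.
binarized = np.where(pixels < 100, 0, 255)
print(binarized)  # [[  0 255   0]
                  #  [255   0 255]]
```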
Remember boolean indexing?
(head on over to slide 40 of Lecture 8--January 31--if you're a little fuzzy on the details)
In short, you can create masks based on certain boolean conditions so you can modify certain parts of the array while holding the others constant.
Here's the example straight from the lecture:
In [22]:
x = np.random.standard_normal(size = (7, 4))
print(x)
If we just want the positive numbers, we can define a mask using the condition you'd find in an if
statement:
In [23]:
mask = x < 0 # For every element of x, ask: is it < 0?
print(mask)
The mask is just a bunch of True
and False
values.
Now we can use the mask to modify the parts of the original array that correspond to True
in the mask:
In [24]:
x[mask] = 0.0
print(x)
Back to images! Let's use a threshold on our blue channel:
In [25]:
b_thresh = np.array(bcon) > 120 # Every pixel greater than 120 is "True", otherwise it's "False"
f = plt.figure(figsize = (12, 6))
f.add_subplot(1, 2, 1)
plt.imshow(np.array(bcon), cmap = "gray")
f.add_subplot(1, 2, 2)
plt.imshow(b_thresh, cmap = "gray")
Out[25]:
Any ideas how we might, say, count the number of cells?
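One rough sketch of an answer (a toy, not a robust cell counter): treat each connected blob of True pixels in the thresholded image as one object, and count the blobs with scipy.ndimage.label. Shown here on a synthetic stand-in rather than the real thresholded channel:

```python
import numpy as np
from scipy import ndimage

# A synthetic binary "image" containing three separate blobs.
binary = np.zeros((20, 20), dtype=bool)
binary[2:5, 2:5] = True
binary[10:14, 10:13] = True
binary[16:18, 3:6] = True

# label() assigns a unique integer to each connected component.
labeled, num_objects = ndimage.label(binary)
print(num_objects)  # 3
```

Real cells that touch or overlap would be merged into one blob, which is why serious cell counting needs more machinery (e.g., watershed segmentation).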
There is an entire ecosystem of computer vision packages for Python.
Some are very general (a lot like scipy.ndimage
and PIL
) while some are very specific to certain classes of problems.
You could spend an entire career with just one or two of these packages, but very briefly I'll name a few of the most popular.
(We'll make use of some of them on Thursday)
If scipy.ndimage
or PIL
proves to be insufficient for your needs, this should be the first stop you take in looking for alternatives.
It has a wealth of general-purpose image processing routines built-in. It's actively developed and very easy to use, and integrates well with NumPy and SciPy.
It also comes with a bunch of basic tutorials and sample data to help you get your feet wet.
This is another excellent general-purpose image processing library, though it has a slight preference for bio-imaging applications. After all, its author is a computational biologist!
Like scikit-image
, it's actively developed, easy to use, and integrates fully with the NumPy + SciPy scientific computing environment for Python.
This is probably your first stop if you're looking for some basic bioimaging tools; we'll make use of it Thursday, most likely.
OpenCV (for "Open Computer Vision") is the granddaddy of image processing packages.
You'll want to use this if computer vision is a significant part of your day-to-day career. It's not for the faint of heart, however: it's a C++ library with Python bindings, which means you have to install from source, and that can be painful depending on how (un)comfortable you are with compiling things from scratch.
(though if you use the Anaconda distribution of Python, and you connect it to the conda-forge channel, you can download pre-built OpenCV packages that WAY SIMPLIFY this process)
That said, OpenCV has everything:
The list goes on and on.
It's well-maintained, well-documented, and while it can be a little tricky to use, it has a huge community of developers and users ready to help.
Like scikit-image
, it also provides a ton of tutorials for typical use-cases, though OpenCV's definition of "typical" is a little different: they're actually pretty in-depth!
Thursday's lecture--
Stay tuned!