Images are just arrays of data, where the data tells us the colors in the image. It will get a little more complicated than this, as we'll see below, but this is the general idea. Since colors are typically represented by three dimensions, image arrays are typically [M x N x 3], and sometimes [M x N x 4], where the final entry of the last dimension contains the alpha or transparency value.
In [4]:
import requests # from webscraping
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib
import cmocean
import cartopy
from PIL import Image # this is the pillow package
from skimage import color
from scipy import ndimage
from io import BytesIO
We'll use an image of Grace Hopper for our sample image. Grace Hopper was one of the first computer programmers, invented the first compiler, and was a US Navy Rear Admiral. She's so important that matplotlib contains a picture of her!
In [5]:
hoppermpl = plt.imread(matplotlib.cbook.get_sample_data("grace_hopper.png"))
What does hoppermpl look like and contain?
In [6]:
print(hoppermpl.shape, type(hoppermpl))
hoppermpl
Out[6]:
... just a bunch of numbers in an array with shape [M x N x 3].
In [7]:
hopperpil = Image.open(matplotlib.cbook.get_sample_data("grace_hopper.png"))
What does hopperpil look like and contain?
In [8]:
print(type(hopperpil))
hopperpil
Out[8]:
The PIL PngImageFile object defaults to a convenient view of the picture itself.
We have a sneak peek at the photo of Grace Hopper from the PIL Image object, but we'll also want to be able to plot the image in other ways that give us more control. Generally for plotting, we'll want to have the data in the form of an array, though there are other options using the PIL package and a PIL object directly.
Let's try the way we've been plotting a lot of our data: pcolormesh or contourf:
In [9]:
fig = plt.figure()
ax = fig.add_subplot(111)
ax.pcolormesh(hoppermpl)  # this raises an error!
Why didn't that work?
When we've used pcolor or contourf in the past, we've always used a 2D array of data (or a single slice of a 3D array). However, this data is 3D due to having red, green, and blue values. Thus, there are too many dimensions to plot it this way.
Instead, we need to use special image-based functions to plot RGB data, for example, imshow:
In [ ]:
fig = plt.figure(figsize=(14, 14))
ax1 = fig.add_subplot(1, 2, 1)
ax1.imshow(hoppermpl)
ax1.set_title('data via matplotlib')
# Get an array of data from PIL object
hopperpilarr = np.asarray(hopperpil)
ax2 = fig.add_subplot(1, 2, 2)
ax2.imshow(hopperpilarr)
ax2.set_title('data via PIL')
Notice that the x-axis 0 value is, as usual, at the left side of the figure. However, the y-axis 0 value is at the top of the figure instead of the typical bottom. This makes the origin for the coordinate axes at the top left instead of the bottom left. This is the convention for image data.
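If you prefer the origin at the bottom left, imshow takes an origin keyword. A minimal sketch: since the first row of the array is the top of the picture, we also reverse the rows so the portrait stays upright.
In [ ]:
fig = plt.figure()
ax = fig.add_subplot(111)
# origin='lower' puts y=0 at the bottom; reverse the rows so the picture isn't flipped
ax.imshow(hoppermpl[::-1], origin='lower')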
In RGB, color is represented as a cube of values from 0 to 1 (or 0 to 255, depending on the representation) for each of red, green, and blue, which, when combined, represent many colors. The Hopper images are currently in RGB. However, RGB is but one representation of color. We could, instead, represent color by its hue, saturation, and value (HSV), where hue is a circular property running from red through yellow, green, and blue and back to red, saturation is the vividness of the color, and value or brightness goes from black to white. And there are many other colorspaces.
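As a quick check of the idea, the standard library's colorsys module (not one of the image packages below) converts single triplets: pure red has hue 0, and blue sits two-thirds of the way around the hue circle.
In [ ]:
import colorsys
print(colorsys.rgb_to_hsv(1.0, 0.0, 0.0))  # red -> (hue=0.0, saturation=1.0, value=1.0)
print(colorsys.rgb_to_hsv(0.0, 0.0, 1.0))  # blue -> hue ~0.667, two-thirds around the circle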
There are at least a handful of Python packages out there you can use to convert color triplets between colorspaces, including colorspacious, which has more options, but we'll use scikit-image.
In [ ]:
hopperhsv = color.convert_colorspace(hoppermpl, "RGB", "HSV")
hopperhsv
In [ ]:
plt.plot(hoppermpl[:,:,0], hopperhsv[:,:,0], '.k');  # red channel vs. hue channel
So the HSV representation is still an array of numbers of the same shape, but they are for sure different: if they were the same, plotting them against each other would give a 1-1 correspondence.
An image can be represented by shades of gray instead of in 3D colorspace; when you convert to grayscale from 3D colorspace, you inherently discard information. There are many ways of doing this transformation, and there are great resources online comparing them.
How might we convert to grayscale? We have RGB information, which is more than we need. What if we just take one channel?
In [ ]:
fig = plt.figure()
ax = fig.add_subplot(111)
ax.pcolormesh(hoppermpl[:,:,0])
What is wrong here? For one thing, she is upside down. Another is that she is still colored, even though we just eliminated all but one color channel.
We can fix the flip in plotting by either flipping the axes by hand or by using a function that is meant to plot image data, like matshow.
In [ ]:
fig = plt.figure()
ax = fig.add_subplot(111)
ax.matshow(hoppermpl[:,:,0])
Grace is being colored by the default colormap, giving her a strange look. Let's choose the grayscale colormap to match what we expect to see here.
In [ ]:
fig = plt.figure()
ax = fig.add_subplot(111)
ax.matshow(hoppermpl[:,:,0], cmap='gray')
We can also just use a built-in function for conversion to grayscale, such as from scikit-image:
In [ ]:
hoppergray = color.rgb2gray(hoppermpl)
print(hoppergray.shape)
fig = plt.figure()
ax = fig.add_subplot(111)
ax.matshow(hoppergray, cmap='gray')
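Note that rgb2gray is not a plain average of the channels: it uses a weighted sum that approximates perceived brightness. As a sketch, the luminance weights documented by scikit-image (assuming they haven't changed between versions) should reproduce its output:
In [ ]:
# weighted sum of R, G, B; the weights are from the scikit-image rgb2gray docs
weights = np.array([0.2125, 0.7154, 0.0721])
hoppergray_manual = hoppermpl @ weights  # matmul collapses the color dimension: [M x N x 3] -> [M x N]
print(np.allclose(hoppergray_manual, hoppergray))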
png files
The png image file format is worth a specific discussion due to its use in applications like satellite data. The pixels in a png file can have different numbers of channels, representing different things. We'll focus on two cases here: the [M x N x 3] and [M x N] cases.
Returning to our web scraping example using satellite data, we find that different types of satellite data products have differently-sized arrays. Note that when you go to the website and examine the information associated with various satellite products, you get hints about how many channels of data it should contain.
First we examine an RGB composite image. The (edited) note associated with this data on the website is as follows:
RGB: Red-Green-Blue composite image showing clouds, ocean, and land. The resulting reflectance in the three MODIS bands (645 nm: R; 555 nm: G; 469 nm: B) is stretched to 0-255 to obtain the RGB image.
This turns out to be pretty straightforward to plot if we just treat the data we've read in as an image:
In [ ]:
# RGB
image_loc = 'http://optics.marine.usf.edu/subscription/modis/GCOOS/2016/daily/091/A20160911855.1KM.GCOOS.PASS.L3D_RRC.RGB.png'
response = requests.get(image_loc) # choose one of the files to show as an example
img = Image.open(BytesIO(response.content))
rgb = np.asarray(img)
print(rgb.shape)
plt.imshow(rgb)
Next we examine a sea surface temperature (SST) image. Here is the edited data note from the site:
SST: Sea Surface Temperature (in Degree C) estimated using the SeaDAS processing software (default product) with a multi-channel non-linear regression algorithm (Brown and Minnett, 1999). The MODIS standard product MOD35 (Ackerman et al., 2010) is used to discriminate clouds from water, and a cloudmask (grey color) is overlaid on the image.
What is this telling us? The data in the image is not represented in three channels like in the previous example, but in a single channel or index. It looks like it is represented in 3D colorspace, but really what we are seeing is a single channel of data mapped through a colormap, just like in any of our typical data plots using pcolormesh, etc. This means that we want to access the data values themselves, which we will then plot with our own colormap.
In [ ]:
# SST
image_loc = 'http://optics.marine.usf.edu/subscription/modis/GCOOS/2016/daily/091/A20160911855.1KM.GCOOS.PASS.L3D.SST.png'
response = requests.get(image_loc) # choose one of the files to show as an example
img = Image.open(BytesIO(response.content))
index = np.asarray(img)
print(index.shape)
plt.matshow(index)
This has shape [M x N] instead of [M x N x 3], so we have used matshow instead of imshow to plot it. Still, the plot doesn't look very good, does it? The land has been colored red, which is taking up part of our 0-255 data range. Let's examine this further with a histogram of the values in the data set.
In [ ]:
n, bins, patches = plt.hist(index.flatten(), range=[0,255], bins=256) # use 256 bins, one for each color representation in the data.
We see a suspicious pattern in the data: there is a reasonable-looking spread of data in the lower part of the available bins, then nothing, then some big peaks with high, singular values (without a spread). This is telling us that the data itself is in the lower part of the representation range, and other parts of the image are represented with reserved larger values.
The histogram values give us a strong clue about this. We can also directly examine the colormap used in this data to figure out the range of the data. The PIL function getpalette returns this information as a list of RGB values:
In [ ]:
img.getpalette()
We need to change this information into a colormap. To do so, we need an [N x 3] array of the colormap values, where N is probably going to be 256 but doesn't have to be. Then we convert this into a colormap object.
In [ ]:
# the -1 in reshape lets that dimension be what it needs to be
palette = np.asarray(img.getpalette()).reshape(-1, 3) # change list to array, then reshape into [Nx3]
palette.shape
In [ ]:
cmap = cmocean.tools.cmap(palette) # Create a colormap object
In [ ]:
plt.matshow(index, cmap=cmap, vmin=0, vmax=255) # use the colormap object
plt.colorbar()
So where exactly is the cutoff for the range of data values? Here we examine the colormap values:
In [ ]:
plt.plot(palette)
# plt.gca().set_xlim(230, 250)
Looks like the highest data value is 235, so everything above that can be masked out.
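One way to do the masking, as a minimal sketch: a numpy masked array leaves the reserved values out of subsequent plots and statistics.
In [ ]:
# mask the reserved palette values (land, clouds) above the highest data value
index_masked = np.ma.masked_where(index > 235, index)
print(index_masked.min(), index_masked.max())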
We want the appropriate x and y coordinates to go with our image. There is information about this on the data page:
The Gulf of Mexico Coastal Ocean Observing System region is an area bounded within these coordinates: 31°N 18°N 79°W and 98°W.
...
All images are mapped to a cylindrical equidistant projection. Images are at 1 kilometer resolution.
A cylindrical equidistant projection is just lon/lat.
In [ ]:
lon = np.linspace(-98, -79, index.shape[1])  # the number of longitudes must match the corresponding dimension of the image array
lat = np.linspace(18, 31, index.shape[0])
lat = lat[::-1]  # the first row of the image is the northernmost, so flip lat to make plotting work immediately
Make a new colormap instance that includes only the data range and not the masking values (since palette also contains color information for the land):
In [ ]:
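One possible approach, assuming (from the palette plot above) that the data occupy palette entries 0 through 235: slice the palette before building the colormap object.
In [ ]:
# keep only the palette entries used for data, dropping the reserved colors
cmap_data = cmocean.tools.cmap(palette[:236])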
Plot the satellite data. What should the range of data be? Be sure to show the colorbar to check your work.
In [ ]:
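A sketch of one way to do it, using the masked array and sliced colormap from the previous sketches together with the lon/lat coordinates:
In [ ]:
# shading='auto' lets pcolormesh accept coordinates the same size as the data
plt.pcolormesh(lon, lat, index_masked, cmap=cmap_data, vmin=0, vmax=235, shading='auto')
plt.colorbar()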
How about a good colormap to finish off the plot?
In [ ]:
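Since this is sea surface temperature, cmocean's thermal colormap is a natural fit; a sketch:
In [ ]:
plt.pcolormesh(lon, lat, index_masked, cmap=cmocean.cm.thermal, vmin=0, vmax=235, shading='auto')
plt.colorbar()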
Ok. So we have a plot with a reasonable range for the data and the image looks pretty good. What do these values represent, though? The color index probably doesn't actually have values from datamin to datamax. Rather, we have to determine the range of the data that was used in the original colormap and transform the values to span the correct range.
How do we do this? To start, we need to know the colorbar min and max that were used in the original image. It turns out that while this information is not on the png, it is on the Google Earth representation; the site has a direct link to that data page so we can click around.
In [ ]:
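Once we have those limits, the transformation is linear: palette index 0 maps to the colorbar minimum and 235 to the maximum. A sketch, where datamin and datamax are hypothetical stand-ins for the limits read off that page:
In [ ]:
datamin, datamax = 10.0, 32.0  # hypothetical colorbar limits in deg C; use the values from the data page
sst = datamin + (datamax - datamin) * index_masked / 235.0  # rescale palette indices to temperatures
plt.pcolormesh(lon, lat, sst, cmap=cmocean.cm.thermal, shading='auto')
plt.colorbar()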
In [ ]:
# load a standard test pattern image to demonstrate image manipulation with scipy.ndimage
image_loc = 'https://upload.wikimedia.org/wikipedia/commons/c/c4/PM5544_with_non-PAL_signals.png'
response = requests.get(image_loc)
img = Image.open(BytesIO(response.content))  # using PIL
index = np.asarray(img)
plt.imshow(index)
In [ ]:
findex = ndimage.gaussian_filter(index, 2.0) # filters in all 'three' dimensions, including channel...
plt.imshow(findex) # ...probably not what we want.
Filtering without paying attention to the dimensions of the array altered the colors of the image. But, if we instead filter in space for each channel individually:
In [ ]:
sigma = 2.0 # Standard deviation of the gaussian kernel. Bigger sigma == more smoothing.
findex = np.zeros_like(index)
for channel in range(3):
findex[:, :, channel] = ndimage.gaussian_filter(index[:, :, channel], sigma=sigma)
plt.imshow(findex)
In [ ]:
gsindex = index.sum(axis=-1)  # collapse the color channels into a crude grayscale
fig = plt.figure(figsize=(7.68, 5.76), dpi=100)
ax = fig.add_axes([0, 0, 1, 1])
ax.axis('off')
plt.imshow(gsindex, cmap='gray')
We can use a Sobel filter (the Sobel operator) to quickly calculate gradients in the image array.
In [ ]:
# FINDING GRADIENTS
from scipy.ndimage import sobel, generic_gradient_magnitude
d_gsindex = generic_gradient_magnitude(gsindex, sobel)  # gradient magnitude, with the Sobel operator applied along each axis
# Note screen resolution is about 100 dpi, so let's make sure the figure is big enough to see all the points.
fig = plt.figure(figsize=(7.68, 5.76))
ax = fig.add_axes([0, 0, 1, 1])
ax.axis('off')
ax.matshow(d_gsindex, cmap='gray')
In [ ]:
# INTERPOLATION/MAPPING
x = 768*np.random.rand(50000)  # random column positions across the 768-pixel width
y = 576*np.random.rand(50000)  # random row positions across the 576-pixel height
xy = np.vstack((y, x))  # map_coordinates expects (row, col) ordering
z = ndimage.map_coordinates(gsindex, xy)  # interpolate the image at those fractional positions
plt.scatter(x, y, 10, z, edgecolor='none')
In [ ]:
# ROTATING
rgsindex = ndimage.rotate(gsindex, 15, mode='wrap')
fig = plt.figure(figsize=(7.68, 5.76), dpi=100)
ax = fig.add_axes([0, 0, 1, 1])
ax.axis('off')
plt.imshow(rgsindex, cmap='gray')
# Note, the image size increased to accommodate the rotation.
print(rgsindex.shape, gsindex.shape)
Try some other rotations. Does the increase in image size make sense based on trigonometry? What happens with a 90 degree rotation?
Look at the documentation, and try different modes. What's the difference between 'constant' and 'wrap'?
Try rotating back and forth 15 degrees at least 10 times, using various modes (and be sure to set reshape=False to prevent the image from growing over the iterations); see the sketch below.
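A sketch for that last exercise, starting from the grayscale test card; with reshape=False the array keeps its shape, and the boundary artifacts that accumulate depend on the mode:
In [ ]:
rot = gsindex.copy()
for _ in range(10):
    rot = ndimage.rotate(rot, 15, reshape=False, mode='wrap')   # also try 'constant', 'nearest'
    rot = ndimage.rotate(rot, -15, reshape=False, mode='wrap')
plt.imshow(rot, cmap='gray')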
In [ ]:
im = np.zeros((128, 128))
im[32:-32, 32:-32] = 1  # a bright square on a dark background
im = ndimage.rotate(im, 15, mode='constant')  # rotate the square
im = ndimage.gaussian_filter(im, 4)  # smooth the edges
im += 0.2 * np.random.random(im.shape)  # add noise
plt.imshow(im, cmap='viridis')
Now, try to detect the edges of this feature using the scikit-image canny edge detection algorithm:
In [ ]:
from skimage import feature
edges = feature.canny(im, sigma=1) # sigma=1 is the default
plt.imshow(edges, cmap='viridis')
Now let's find the convex hull of the edges that we detected (hopefully only the single square now):
In [ ]:
from skimage.morphology import convex_hull_image
chull = convex_hull_image(edges)
plt.imshow(chull, cmap='viridis')
This would work even for an 'open' object:
In [ ]:
diag_mask = np.triu(np.ones(im.shape))  # upper-triangular mask
edges = edges.astype('float') * diag_mask  # keep only the edges above the diagonal, "opening" the square
chull = convex_hull_image(edges)
fig, axs = plt.subplots(1, 2)
axs[0].imshow(edges, cmap='viridis')
axs[1].imshow(chull, cmap='viridis')
In [ ]:
from skimage import data
from skimage.feature import blob_dog, blob_log, blob_doh
from skimage.color import rgb2gray
# use scikit-image's Hubble Deep Field sample image for blob (galaxy) detection
image = data.hubble_deep_field()[0:500, 0:500]
image_gray = rgb2gray(image)
plt.imshow(image_gray, cmap='gray')
Here is the Laplacian of Gaussian method as an example. How many galaxies are found depends especially on the threshold parameter.
In [ ]:
blobs_log = blob_log(image_gray, max_sigma=30, num_sigma=10, threshold=.4)
# the data are y, x, sigma for all the blobs. Let's make a quick plot.
y = blobs_log[:, 0]
x = blobs_log[:, 1]
sigma = blobs_log[:, 2]
# Calculate the radius of the blob from sigma, which is given in the docs as:
r = sigma*np.sqrt(2)
# represent marker size with r^2 to approximate area, and use log10(r) to give a spread in colors
plt.scatter(x, -y, r**2, np.log10(r), cmap='viridis', edgecolor='none')
plt.colorbar()
plt.axis('tight')
plt.gca().set_aspect(1.0)
Here we show the three algorithms. The Laplacian of Gaussian (LoG) is the most accurate and slowest approach. The Difference of Gaussian (DoG) is a faster approximation of LoG approach. The Determinant of Hessian is the fastest approach but is not accurate for small blobs. More details are available online.
In [ ]:
blobs_dog = blob_dog(image_gray, max_sigma=30, threshold=.1)
# For this method, the radius is again found by multiplying by sqrt(2)
blobs_dog[:, 2] = blobs_dog[:, 2] * np.sqrt(2)
# for this method, the third output parameter is already approximately the blob radius
blobs_doh = blob_doh(image_gray, max_sigma=30, threshold=.01)
blobs_list = [blobs_log, blobs_dog, blobs_doh]
colors = ['yellow', 'lime', 'red']
titles = ['Laplacian of Gaussian', 'Difference of Gaussian',
'Determinant of Hessian']
fig, axes = plt.subplots(1, 3, figsize=(14, 4), sharex=True, sharey=True)
plt.tight_layout()
# loop over the parameters in `zip` simultaneously
for blobs, color, title, ax in zip(blobs_list, colors, titles, axes):
ax.set_title(title)
# show the image
ax.imshow(image, interpolation='nearest')
ax.set_axis_off()
# draw the detected blobs over the image to compare
for blob in blobs:
y, x, r = blob
c = plt.Circle((x, y), r, color=color, linewidth=2, fill=False)
ax.add_patch(c)
plt.show()