Images and Image Plotting

Learning Objectives

  • Understand the concept of arrays as images
  • Load and display an image
  • Use array slicing operations to crop an image
  • Animate an image plot

Arrays as images

All photographic images represent a measurement of how much light hits the receiver. For instance, the Hubble image below is obtained by measuring the brightnesses of distant stars:

With traditional optical cameras, this measurement results in an image which is continuous, as it is projected directly onto paper. In order to store images digitally, they need to be divided into discrete chunks, pixels, each of which contains the value of the measurement in that small portion of the image. In this representation, an image is simply a grid of numbers, which allows it to be easily stored as an array with a shape equal to the resolution of the image.

The scikit-image (abbreviated to skimage in code) module contains some sample images in the data submodule that we can use to demonstrate this principle.


In [ ]:
# Get some import statements out of the way.
from __future__ import division, print_function
%matplotlib inline
import matplotlib.pyplot as plt
from skimage import data

In [ ]:
# Load the data.moon() image and print it
moon = data.moon()
print(moon)

Once read in as an array, the image can be processed in the same ways as any other array. For instance, we can easily find the highest, lowest and mean values of the image, the type of the variables stored in the array, and the resolution of the image:


In [ ]:
# Output the image minimum, mean and maximum.
print('Image min:', moon.min(), '; Image mean:', moon.mean(), '; Image max: ', moon.max())

# Output the array dtype.
print('Data type:', moon.dtype)

# Output image size.
print('Image size:', moon.shape)

This tells us that the image has a resolution of 512 x 512 pixels, and is stored as integers between 0 and 255. This is a common way of normalising images, but they can just as easily be stored as floats between 0 and 1. More commonly with astronomical data though, an image will consist of photon counts (i.e. integers), so the minimum will be 0 and any upper limit will likely be defined by the capabilities of the instrument.

Plotting images

While storing an image as a grid of numbers is very useful for analysis, we still need to be able to visually inspect the image. This can be achieved with plt.imshow(), which allocates a colour to every element in the array according to its value.


In [ ]:
# Display image array with imshow()
plt.imshow(moon)

When plotting an image in this way, you will often need to know what actual values correspond to the colours. To find this out, we can draw a colour bar alongside the image which indicates the mapping of values to colours:


In [ ]:
plt.imshow(moon)
plt.colorbar()

You may notice that the default mapping of values to colours doesn't show the features of this image very well. Fortunately, matplotlib provides a large variety of colour maps which are suitable for various different purposes (more on this later). plt.imshow() has a cmap keyword argument which can be passed a string defining the desired colour map.


In [ ]:
# Display the image with a better colour map.
plt.imshow(moon, cmap='gray')
plt.colorbar()

The full list of available colour maps (for matplotlib 1.5) can be found here.

Colour maps

As the images above demonstrate, the choice of colour map can make a significant difference to how your image appears, and is therefore extremely important. This is partly due to discrepancies between how quickly the colour map changes and how quickly the data changes, and partly due to the fact that [different people see colour differently](https://en.wikipedia.org/wiki/The_dress_%28viral_phenomenon%29).

In particular, matplotlib's default `'jet'` colour map is notoriously bad for displaying data. This is because it is not designed taking into account how the human eye percieves colour. This leads to some parts of the colour map appearing to change very slowly, while other parts of the colour map shift from one hue to another in a very short space. The practical effect of this is to both smooth some parts of the image, obscuring the data, and to create artificial features in the image where the data is smooth.

There is no single 'best' colour map - different colour maps display different kinds of image most clearly - but the `jet` map is almost never an appropriate choice for displaying any data. In general, colour maps which vary luminosity uniformly (such as the `'gray'` colour map above or the `'cubehelix'` colour map) tend to be better. Plots of various colour maps' luminosities can be found [here](http://matplotlib.org/users/colormaps.html).

For a good background on this topic and a description of a decent all-round colour map scheme, see [this paper](http://www.kennethmoreland.com/color-maps/ColorMapsExpanded.pdf).

Load and plot an image

  1. Try loading and plotting some other image arrays from `skimage.data`. Choose one of these images and print some basic information about the values it contains.
  2. Plot your chosen image with `imshow()`. Apply a colour map of your choice and display a colour bar.

In [ ]:
# 1

# Load an image from skimage.data
my_image = data.coins()
# Print image shape and size
print(my_image.shape, my_image.size)
# Print data type and min&max  of array
print(my_image.dtype, my_image.min(), my_image.max())

In [ ]:
# 2

# Display my image
plt.imshow(my_image, cmap='cubehelix')
plt.colorbar()

Value limits

The default behaviour of imshow() in terms of colour mapping is that the colours cover the full range of the data so that the lower end (blue, in the plots above) represents the smallest value in the array, and the higher end (red) represents the greatest value.

This is fine if rest of the values are fairly evenly spaced between these extremes. However, if we have a very low minimum or very high maximum compared to the rest of the image, this default scaling is unhelpful. To deal with this problem, imshow() allows you to set the minimum and maximum values used for the scaling with the vmin and vmax keywords.


In [ ]:
plt.imshow(moon, cmap='gray', vmin=75, vmax=150)
plt.colorbar()

As you can see, this allows us to increase the contrast of the image at the cost of discounting extreme values, or we can include a broader range of values but see less detail. Similar effects can also be achieved with the norm keyword, which allows you to set how imshow() scales values in order to map them to colours (linear or logarithmic scaling, for example).

Axes

You will notice in the above plots that the axes are labelled with the pixel coordinates of the image. You will also notice that the origin of the axes is in the top left corner rather than the bottom left. This is a convention in image-drawing, but can be changed if necessary by setting the origin keyword to 'lower' when calling imshow():


In [ ]:
plt.imshow(moon, cmap='gray', origin='lower')

imshow() also allows you to change the upper and lower values of each axis, and the appropriate tick labels will be drawn. This feature can be used to apply physical spatial scales to the image (if you know them) rather than going purely on pixel positions, which may be less useful. This is done with the extent keyword, which takes a list of values corresponding to lower and upper x values and the lower and upper y values (in that order).


In [ ]:
plt.imshow(moon, cmap='gray', origin='lower', extent=[-1, 1, 0, 2])

Value and axes limits

  1. Plot your chosen image again. Try changing the upper and lower limits of the plotted values to adjust how the image appears.
  2. Assume that each pixel of your image has some defined size (you decide a value - not unity). Adjust the axis limits accordingly so that the ticks correspond to physical distances rather than pixel values.

In [ ]:
# 1 

# Display the coins image with adjusted value range
plt.imshow(my_image, cmap='cubehelix', vmin=60, vmax=180)
plt.colorbar()

In [ ]:
# 2
pixelsize = 0.1
plt.imshow(my_image, cmap='cubehelix', vmin=60, vmax=180, extent=[0, my_image.shape[1]*pixelsize, 0, my_image.shape[0]*pixelsize])
plt.xlabel('x (cm)')
plt.ylabel('y (cm)')
plt.colorbar()

Loading an image from a file

The image used in the examples above uses an image which is already supplied as an array by scikit-image. But what if we have been given an image file and we want to read it into Python?

There are many ways to do this, depending on the type of file. Typically in astronomy, images are stored in FITS format, which will be covered in detail later on. For now, we will return to the example of the Hubble image from earlier, which is stored in this repo in fig/galaxy.jpg. To load image data from a JPEG, we need the plt.imread() function. This takes a filename which points at an image file and loads it into Python as a NumPy array.


In [ ]:
# Load the Hubble image from fig/galaxy.jpg
galaxy_image = plt.imread('fig/galaxy.jpg')
# PLot the image with imshow()
plt.imshow(galaxy_image)

You may notice that instead of using a colour map, this image has been plotted in full colour so it looks the same as the original image above. We can see why if we inspect the shape of the image array:


In [ ]:
galaxy_image.shape

Rather than just being a 2D array with a shape equivalent to the image resolution, the array has an extra dimension of length 3. This is because the image has been split into red, blue and green components, each of which are stored in a slice of the array. When given an n x m x 3 array like this, imshow() interprets it as an RGB image and combines the layers into a single image.

However, if we wish to see the individual components they can be accessed and displayed by taking a slice of the array corresponding to the layer we wish to use.


In [ ]:
plt.imshow(galaxy_image[..., 0], cmap='Reds') # Plot the red layer of the image
plt.show()
plt.imshow(galaxy_image[..., 1], cmap='Greens') # Plot the green layer of the image
plt.show()
plt.imshow(galaxy_image[..., 2], cmap='Blues') # Plot the blue layer of the image
plt.show()

plt.subplots()

As we've already seen, multiple axes can be added to a single figure using plt.add_subplot(). There is also a function that allows you to define several axes and their arrangement at the same time as the figure, plt.subplots().

This function returns a tuple of two objects - the figure and an array of axes objects with the specified shape. Referencing the axes array allows things to be plotted on the individual subplots.


In [ ]:
# Make a grid of 1 x 3 plots and show the Hubble image on the right.
fig, ax = plt.subplots(1, 3)
ax[2].imshow(galaxy_image)
plt.show()

Image components

  1. Create a 2x2 grid of plots using `plt.subplots()`. For either the Hubble image or another RGB image of your choice from `skimage.data`, plot the true colour image and each RGB component on one of these subplots.

In [ ]:
# 1

my_image = data.coffee()

# Create 2x2 grid of subplots
fig, axes = plt.subplots(2, 2)

# Plot image and image components with appropriate colour maps.
axes[0, 0].imshow(my_image)
axes[0, 1].imshow(my_image[..., 0], cmap='Reds')
axes[1, 0].imshow(my_image[..., 1], cmap='Greens')
axes[1, 1].imshow(my_image[..., 2], cmap='Blues')

Slicing images

We saw above that an RGB image array can be sliced to access one colour component. But the array can also be sliced in one or both of the image dimensions to crop the image. For instance, the smaller galaxy at the bottom of the image above occupies the space between about 200 and 400 pixels in the x direction, and stretches from about 240 pixels to the edge of the image in the y direction. This information allows us to slice the array appropriately:


In [ ]:
# Crop the image in x and y directions but keep all three colour components.
cropped_galaxy = galaxy_image[240:, 200:400, :]
plt.imshow(cropped_galaxy)

Similarly, if we need to reduce the image resolution for whatever reason, this can be done using array slicing operations.


In [ ]:
# Crop athe image and use only every other pixel in each direction to reduce the resolution.
lowres_galaxy = galaxy_image[240::2, 200:400:2, :]
plt.imshow(lowres_galaxy)

IMPORTANT NOTE: you should probably never do the above with actual astronomical data, because you're throwing away three quarters of your measurement. There are better ways to reduce image resolution which preserve much more of the data's integrity, and we will talk about these later. But it's useful to remember you can reduce an image's size like this, as long as you don't need that image for any actual science.

Interpolation

In order to display a smooth image, imshow() automatically interpolates to find what values should be displayed between the given data points. The default interpolation scheme is 'linear', which interpolates linearly between points, as you might expect. The interpolation can be changed with yet another keyword in imshow(). Here are a few examples:


In [ ]:
# Image with default interpolation
fig, ax = plt.subplots(2, 2, figsize=(16, 16))
smallim = galaxy_image[:100, 250:350, :]
ax[0, 0].imshow(smallim) # Default (linear) interpolation
ax[0, 1].imshow(smallim, interpolation='bicubic') # Bicubic interpolation
ax[1, 0].imshow(smallim, interpolation='nearest') # Nearest-neighbour interpolation
ax[1, 1].imshow(smallim, interpolation='none') # No interpolation

This can be a useful way to change how the image appears. For instance, if the exact values of the data are extremely important, little or no interpolation may be appropriate so the original values are easier to discern, whereas a high level of interpolation can be used if the smoothness of the image is more important than the actual numbers.

Note that that 'none' in the imshow() call above is NOT the same as None. None tells imshow() you are not passing it a variable for the interpolation keyword, so it uses the default, whereas 'none' explicitly tells it not to interpolate.

Animation

We have already seen animation of data points on basic plots in a previous lesson. Animating an image is no different in principle. To demonstrate this, we'll set up an animation that shows the Hubble image and then cycles through each of the RGB components. This task requires all the same parts as an animation of a line or scatter plot:

  • First, we'll need matplotlib.animation, a figure and an axes. Then we'll plot the initial image we want to display and return the plot object to a variable we can use for the animation.
  • Now we need to define the function that will adjust the image. This function, like the ones we used for line plots, needs to take as input an integer which counts the number of 'frames', adjust the displayed data and return the adjusted object.
  • Then we can define the animation object and plot it to see the finished product.

In [ ]:
import matplotlib.animation as ani
# We'll need this for displaying animations
%matplotlib nbagg

fig, ax = plt.subplots()
display = plt.imshow(galaxy_image)

titles = ['Red component', 'Green component', 'Blue component', 'Combined image']
cmaps = ['Reds_r', 'Greens_r', 'Blues_r']

def animate(i):
    try:
        display.set_data(galaxy_image[..., i])
        display.set_cmap(cmaps[i])
    except IndexError:
        display.set_data(galaxy_image)
    ax.set_title(titles[i])
    return display

myanim = ani.FuncAnimation(fig, animate, range(4), interval=1000)
plt.show()

Moving around an image

  1. Plot a small portion at one end of your chosen image. Then animate this plot so that it pans across to the other side of the image.

In [ ]:
# 1
fig, ax = plt.subplots()
y0 = 0
y_ext = 120

display = plt.imshow(galaxy_image[y0:y0+y_ext, 200:400])

def pan(i):
    y1 = y0 + i
    display.set_data(galaxy_image[y1:y1+y_ext, 200:400])
    return display

panimation = ani.FuncAnimation(fig, pan, range(galaxy_image.shape[0]-y_ext), interval=10)
plt.show()

FITS files

A type of image file that you are quite likely to come across in astronomy is FITS (Flexible Image Transport System) files. This file type is used for storing various types of astronomical image data, including solar images. The advantage of FITS files is that as well as storing the numerical data which makes up the image, they also store a header associated with these data. The header usually contains information such as the spatial extent of the image, the resolution, the time at which the observation was taken, and various other properties of the data which may be useful when using the image for research. These pairs of data arrays and associated headers are stored in a HDU (Header-Data Unit). Several HDUs can be stored in a FITS file, so they are kept in a container called HDUList.


In [ ]:
from astropy.io import fits
import sunpy.data

#sunpy.data.download_sample_data()

aia_file = fits.open('/home/drew/sunpy/data/sample_data/aia.lev1.193A_2013-09-21T16_00_06.84Z.image_lev1.fits')
aia_file.verify('fix')

print(type(aia_file))
print(aia_file)

print(type(aia_file[1].data), type(aia_file[1].header))
print(aia_file[1].data)
print(aia_file[1].header['NAXIS1'])
print(aia_file[1].header['NAXIS2'])
print(aia_file[1].header['DATE-OBS'])
for tag in aia_file[1].header.keys():
    print(tag, aia_file[1].header[tag])