Graphics and Visualisation

I recommend you follow along by typing the code snippets into your own file.

We will make use of the pylab part of the matplotlib package. The matplotlib library is a powerful tool capable of producing complex publication-quality figures with fine layout control in two and three dimensions; in this session we will provide a minimal self-contained introduction to its usage that covers the basic functionality. I encourage you to read the tutorials included with the matplotlib documentation as well as to browse its extensive gallery of examples that include source code.

We will use the shorthand plt for the matplotlib.pyplot module where the easy-to-use plotting functions reside. Similarly we'll use np as shorthand for numpy.

Lab Session Contents:

  1. Simple 1D graphs
  2. 2D graphs and images
  3. 3D graphs
  4. Animations

There are more examples of all of these in Chapter 3 of Newman's Python for Physicists book.

For more on matplotlib, including tutorials visit matplotlib.org.

Specific matplotlib pages which you may find the most useful:


In [1]:
%matplotlib inline

Graphs

A basic plot is simple. You need to import the plotting packages, define an array to plot, plot it and then show it (try it with and without the show() to see the difference).

The x-array below is just the index numbers of the y-array.


In [2]:
from pylab import plot,show
y = [ 1.0, 2.4, 1.7, 0.3, 0.6, 1.8 ]
plot(y)
show()


Another way of importing the packages is demoed below:


In [3]:
import matplotlib.pyplot as plt

plt.plot(y)
plt.show()


For a more complicated plot, we will make an x-array (making use of the linspace command from numpy), calculate it's sine and cosine and plot both on the same plot.


In [5]:
import matplotlib.pyplot as plt
from numpy import linspace,sin,cos

x = linspace(0,10,100)
y = sin(x)
z = cos(x)
plt.plot(x,y)
plt.plot(x,z)
plt.show()


You should never include a plot like this in a report. What's wrong with it?

Adding Axis Labels, Titles, Legends

Axis labels use of xlabel and ylabel as follows:

If you want a plot title, use plt.title.

For a legend, label the different lines in plot, and then use plt.legend.

The labels accept LaTeX formatting for maths. We will cover LaTeX in the next lab session.

You might want a grid, which can be added using plt.grid().


In [6]:
#Trend labels
plt.plot(x, y, label=r'$\sin(x)$')
plt.plot(x, z, label=r'$\cos(x)$')

#Title
plt.title('Trigonometic Functions')

#Axis labels
plt.xlabel("$x$ axis")
plt.ylabel("$y = \sin x$  or  y = cos x")

#Add a grid
plt.grid()

#Add a legend box
plt.legend()

#Actually show the plot
show()


If you want to improve the layout further you might want a little white space above and below the curves. Do this with ylim.


In [7]:
plt.plot(x, y, label=r'$\sin(x)$')
plt.plot(x, z, label=r'$\cos(x)$')

#Adding white space defining the range of the plot
plt.ylim(-1.1,1.1)

plt.title('Trigonometic Functions')
plt.xlabel("$x$ axis")
plt.ylabel("$y = \sin x$  or  $y = \cos x$")
plt.grid()
plt.legend()
show()


Line Formatting

You might want the lines to be different colours from the default. Do this by adding a formating string to the plot command.

It will have this form:

plot(x,y,"CS")

where C is a character which tells the colour. Here are some common options:

  • r = red
  • g = green
  • b = blue
  • c = cyan
  • m = magenta
  • y = yellow
  • k = black
  • w = white

S will be a string defining the line style. For example:

  • - = solid line
  • -- = dashed line
  • o = circular points
  • s = square points
  • . = small dots

For example:


In [8]:
#Tell it what to plot, what colour and linestyle and label them
plt.plot(x, y, 'r-', label=r'$\sin(x)$')
plt.plot(x, z, 'm--', label=r'$\cos(x)$')

#Plot range
plt.ylim(-1.1,1.1)

#Plot title and axis labels, grid and legend
plt.title('Trigonometic Functions')
plt.xlabel("$x$ axis")
plt.ylabel("$y = \sin x$  or  $y = \cos x$")
plt.grid()
plt.legend()

#Show the plot
show()


Hopefully this gives you a sense of the possible formatting options.

There are many more options than we can go through in lab.

Visit matplotlib.org for the full documentation.

Other places to find example plots:

  • StackOverflow (this is the python subsection)
  • Github - and search for what you want to do, someone may have done it already.

Plotting Data from a File

In python is is very easy to load files of data and plot them. The numpy package has modules which can read different file formats. Make a simple text file with two columns of data. E.g.

0 12121.71 1 12136.44 2 12226.73 3 12221.93 4 12194.13 5 12283.85 6 12331.6 7 12309.25

Call this file values.txt, and make sure it's in the same directory as your code.


In [10]:
import numpy as np

#Use the numpy package loadtxt to inport data from a file
data = np.loadtxt("values.txt",float)

#Tell python which bit is x and which bit is y
x = data[:,0]
y = data[:,1]

#Plot these data
plt.plot(x,y,'ko')
plt.xlabel("x axis")
plt.ylabel("Y axis")
show()


Adding Error Bars

In many experimental applications you will need to plot error bars on your points. We can do this using the syntax

plt.errorbar(x,y,xerr=A,yerr=B)

For example for a fixed size error bar:


In [11]:
data = np.loadtxt("values.txt",float)
x = data[:,0]
y = data[:,1]
plt.plot(x,y,'ko')

#Add error bars
plt.errorbar(x, y, xerr=0.2, yerr=40)

plt.title("Simplest errorbars, 0.2 in x, 40 in y");
plt.xlabel("x axis")
plt.ylabel("Y axis")
plt.show()


It is also possible to add variable error bars as follows.


In [12]:
# example variable error bar values
yerr = 0.01*y
xerr = 0.1*np.sqrt(x)

data = np.loadtxt("values.txt",float)
x = data[:,0]
y = data[:,1]
plt.plot(x,y,'ko')
plt.errorbar(x, y, xerr=xerr, yerr=yerr)
plt.title("Variable error bars, 1% in y, and 10% of Poisson Counting errors in x");
plt.xlabel("x axis")
plt.ylabel("Y axis")
plt.show()


In Class Exercise

We will now practice these plotting skills by making various plots from data in the file sunspots.txt (on Moodle). This file contains sunspot number each month since January 1749. You will need to convert the x values into a more useful number than months since Jan 1749.

  1. First plot the number of sunspots as a function of time, expressed in years. Make sure it is properly labelled, use ylim to customize the axes range, and make your own selection of line colour and style.

  2. Now plot only the data in the last 20 years of the file (1991-2011).

  3. Now copy the above plot and add Poisson counting error bars to the y-values. (Poisson counting error bars are simply +/-sqrt N).

  4. Now copy the above and overplot a sine wave. Adjust it to make a good visual match with the data. You will need to adjust the amplitude, x and y axis zero points and period to match. E.g.

y = A + B*sin(C/pi(x-D))

[where A = where the zero line should be, B = amplitude around A, C = the period of the oscillation in years, and D = the zero point in years]. Hint the period of the solar cycle is about 11 years.

Other Types of Plots (Histograms and log spacing)

First a histogram.

This example also demos a different way to set axis ranges (using plt.axis), and how to write text on the plot.


In [13]:
#Let's generate a Gaussian with mean 100, standard deviation 15 using the random function
mu, sigma = 100, 15
x = mu + sigma * np.random.randn(10000)

# the histogram of the data
n, bins, patches = plt.hist(x, 50, normed=1, facecolor='m', alpha=0.75)

#alpha is a parameter which sets the saturation of the colour
#normed normalises the area to 1.  

plt.xlabel('x axis')
plt.ylabel('Probability')
plt.title('Histogram of a Gaussian Distribution')

# This will put a text fragment at the position given:
plt.text(55, .027, r'$\mu=100,\ \sigma=15$', fontsize=14)
plt.axis([40, 160, 0, 0.03])
plt.grid(True)


Finally a log plot. These are frequently used in astronomy where the dynamic range of a plot can be very large. Let's plot first on a normal linear scale.


In [14]:
x = np.linspace(0, 10, 100)
y = np.exp(-x)
plt.plot(x, y);



In [15]:
plt.semilogy(x, y);


Customising your Plots

Problems with plots in papers/reports (not just students, academics do a lot of these too):

  • font size is small
  • font type is ugly
  • tick size is small
  • lines are too thin
  • colours unsuitable for people who are colour blind
  • legend formatting is ugly
  • ...

(you can read entire books on this stuff, but this is a Physics unit so I'll stop there).

The easiest way to make consistent plots is to define a new matplotlib style in its own .py file and import it every time you want to plot some data. For an example style file see mpl_style.py (Credit: Coleman Krawcyzk) that increases line widths, increases fonts sizes, uses a better color cycle, formats ticks and axes larger, and sets to default colormap to something reasonable (magma, plasma, inferno, and viridis are all color blind friendly and convert to black and white without issue).


In [16]:
import mpl_style
plt.style.use(mpl_style.style1)

Let's try the histogram again:


In [17]:
#Let's generate a Gaussian with mean 100, standard deviation 15 using the random function
mu, sigma = 100, 15
x = mu + sigma * np.random.randn(10000)

# the histogram of the data
n, bins, patches = plt.hist(x, 50, normed=1, facecolor='m', alpha=0.75)

#alpha is a parameter which sets the saturation of the colour
#normed normalises the area to 1.  

plt.xlabel('x axis')
plt.ylabel('Probability')
plt.title('Histogram of a Gaussian Distribution')

# This will put a text fragment at the position given:
plt.text(55, .027, r'$\mu=100,\ \sigma=15$', fontsize=14)
plt.axis([40, 160, 0, 0.03])
plt.grid(True)


In Class Exercise

Make a histogram of the sunspot number on the Sun from the file.

Make sure to customize the graphics using the style file.

2D Plots

There are many applications where plotting a 2D array will be useful.

A basic 2D plot is very easy. It makes use of the matplotlib module, imshow.

cmap defines the colour map. Here are some options:

  • Default (ie. if nothing specified): red-blue heat map (jet)
  • Blues (note Blues_r or the equivalent will reserve the default direction)
  • Greys - greyscale
  • hot - heat map that goes black-red-yellow-white
  • spectral - spectrum with 7 clearly defined colors, plus black and white
  • bone - alternative gray-scale with a hint of blue
  • hsv - rainbow scheme that starts and ends with red
  • magma - white to black via reds/yellows (like hot magma)
  • plasma - white to blue/purple via yellow
  • inferno - very similar to magma
  • viridis - yellow to blue via green

Try the below plots out with various colourmaps.


In [19]:
plt.imshow(np.random.rand(100,100), interpolation='nearest',cmap='plasma');


The 2D array can be read in from a file of course. For example, making use of the default colourmap (need to grab the file circular.txt for this to work).


In [20]:
data = np.loadtxt("circular.txt",float)
plt.imshow(data,origin="lower",cmap='magma')


Out[20]:
<matplotlib.image.AxesImage at 0x7fe745145128>

If you use a colour scale to indicate a physical quantity you will want to include a colourbar. This is easy using colorbar()

There is an art to selecting the right colour map to display data/visualisations. You may also wish to consider how plots look printed by a B&W printer, or viewed by people with common types of colour blindness.


In [21]:
plt.imshow(data,origin="lower",cmap='viridis')
plt.colorbar()


Out[21]:
<matplotlib.colorbar.Colorbar at 0x7fe74594b470>

Images

Python makes it easy to include images in your plots, and even manipulate them. You can use the Soyuz_launch.png images provided on Moodle or any appropriate image (.png) of your choice.


In [23]:
img = plt.imread('Soyuz_launch.png')
plt.imshow(img)
plt.show()


We can also explore just single layers of this image, which is constructed from a digital camera from RBG (Red,Blue,Green in that order) filters in the digital camera. For example:


In [28]:
#Red image
plt.imshow(img[:,:,0], cmap="Reds_r")
plt.show()



In [29]:
#Blue image
plt.imshow(img[:,:,1], cmap="Blues_r")
plt.show()



In [30]:
#Green image
plt.imshow(img[:,:,2], cmap="Greens_r")
plt.show()


Subplots

It takes a lot of space to make all those plots separately, but we can make subplots as follows.


In [43]:
fig, ax = plt.subplots(1,4, figsize=(10,6))
ax[0].imshow(img[:,:,0], cmap="Reds_r")
ax[1].imshow(img[:,:,1], cmap="Blues_r")
ax[2].imshow(img[:,:,2], cmap="Greens_r")
ax[3].imshow(img);
for a in ax:
    a.set_xticklabels([])
    a.set_yticklabels([])
plt.show()



In [35]:
redimg=img[:,:,0]
plt.hist(redimg.ravel(), bins=256, range=(0.0, 1.0), fc='k', ec='k')


Out[35]:
(array([     0.,      0.,      0.,  19676.,   9220.,   4785.,   9664.,
          5085.,   4993.,   4834.,   4935.,   4818.,   4816.,   4605.,
          4486.,   4435.,   4276.,   4389.,   4281.,   4521.,   4537.,
          4567.,   4656.,   4825.,   4784.,   5028.,   4976.,   5028.,
          4960.,   5044.,   4915.,   4832.,   4637.,   4474.,   4444.,
          4620.,   4469.,   4411.,   4365.,   4230.,   4061.,   4076.,
          3967.,   3830.,   3827.,   3739.,   3715.,   3749.,   3581.,
          3781.,   3899.,   3905.,   4182.,   4204.,   4427.,   4622.,
          4725.,   4600.,   4688.,   4588.,   4479.,   4236.,   4329.,
          4298.,   4390.,   4614.,   4668.,   5016.,   5513.,   6406.,
          8118.,  11216.,  16270.,  19559.,  23425.,  28308.,  29498.,
         31641.,  30166.,  28642.,  28625.,  28197.,  26768.,  26460.,
         26656.,  28608.,  28353.,  28281.,  27726.,  24934.,  22339.,
         21031.,  22645.,  25197.,  29176.,  33670.,  34549.,  32983.,
         28258.,  25649.,  22686.,  20294.,  17834.,  17830.,  19284.,
         21950.,  22241.,  23369.,  25591.,  29316.,  33407.,  37296.,
         37249.,  36558.,  31279.,  26539.,  24837.,  23118.,  21650.,
         20077.,  19432.,  19029.,  17887.,  18170.,  19081.,  18908.,
         17539.,  16746.,  17959.,  19048.,  19235.,  19269.,  18086.,
         17245.,  19021.,  21947.,  23401.,  26065.,  29301.,  30435.,
         29389.,  27397.,  26922.,  27786.,  30187.,  31722.,  32871.,
         32475.,  29801.,  27317.,  24747.,  21764.,  19362.,  16556.,
         14397.,  10207.,   7062.,   5299.,   4742.,   3904.,   3392.,
          3023.,   2901.,   2780.,   2674.,   2539.,   2499.,   2386.,
          2331.,   2359.,   2386.,   2287.,   2313.,   2274.,   2162.,
          2170.,   2118.,   2122.,   2154.,   2100.,   1992.,   1888.,
          1897.,   1950.,   1851.,   1872.,   1735.,   1753.,   1715.,
          1648.,   1535.,   1502.,   1513.,   1572.,   1523.,   1566.,
          1498.,   1599.,   1610.,   1560.,   1567.,   1635.,   1612.,
          1587.,   1566.,   1437.,   1495.,   1561.,   1497.,   1419.,
          1429.,   1416.,   1492.,   1408.,   1406.,   1352.,   1454.,
          1498.,   1604.,   1604.,   1598.,   1575.,   1615.,   1536.,
          1613.,   1473.,   1511.,   1549.,   1488.,   1438.,   1380.,
          1396.,   1402.,   1457.,   1627.,   1732.,   1889.,   2110.,
          1978.,   1946.,   2160.,   2662.,   2481.,   7695.,   2668.,
          1695.,   1614.,   1196.,    923.,    785.,    708.,    616.,
           547.,    506.,    408.,   3435.]),
 array([ 0.        ,  0.00390625,  0.0078125 ,  0.01171875,  0.015625  ,
         0.01953125,  0.0234375 ,  0.02734375,  0.03125   ,  0.03515625,
         0.0390625 ,  0.04296875,  0.046875  ,  0.05078125,  0.0546875 ,
         0.05859375,  0.0625    ,  0.06640625,  0.0703125 ,  0.07421875,
         0.078125  ,  0.08203125,  0.0859375 ,  0.08984375,  0.09375   ,
         0.09765625,  0.1015625 ,  0.10546875,  0.109375  ,  0.11328125,
         0.1171875 ,  0.12109375,  0.125     ,  0.12890625,  0.1328125 ,
         0.13671875,  0.140625  ,  0.14453125,  0.1484375 ,  0.15234375,
         0.15625   ,  0.16015625,  0.1640625 ,  0.16796875,  0.171875  ,
         0.17578125,  0.1796875 ,  0.18359375,  0.1875    ,  0.19140625,
         0.1953125 ,  0.19921875,  0.203125  ,  0.20703125,  0.2109375 ,
         0.21484375,  0.21875   ,  0.22265625,  0.2265625 ,  0.23046875,
         0.234375  ,  0.23828125,  0.2421875 ,  0.24609375,  0.25      ,
         0.25390625,  0.2578125 ,  0.26171875,  0.265625  ,  0.26953125,
         0.2734375 ,  0.27734375,  0.28125   ,  0.28515625,  0.2890625 ,
         0.29296875,  0.296875  ,  0.30078125,  0.3046875 ,  0.30859375,
         0.3125    ,  0.31640625,  0.3203125 ,  0.32421875,  0.328125  ,
         0.33203125,  0.3359375 ,  0.33984375,  0.34375   ,  0.34765625,
         0.3515625 ,  0.35546875,  0.359375  ,  0.36328125,  0.3671875 ,
         0.37109375,  0.375     ,  0.37890625,  0.3828125 ,  0.38671875,
         0.390625  ,  0.39453125,  0.3984375 ,  0.40234375,  0.40625   ,
         0.41015625,  0.4140625 ,  0.41796875,  0.421875  ,  0.42578125,
         0.4296875 ,  0.43359375,  0.4375    ,  0.44140625,  0.4453125 ,
         0.44921875,  0.453125  ,  0.45703125,  0.4609375 ,  0.46484375,
         0.46875   ,  0.47265625,  0.4765625 ,  0.48046875,  0.484375  ,
         0.48828125,  0.4921875 ,  0.49609375,  0.5       ,  0.50390625,
         0.5078125 ,  0.51171875,  0.515625  ,  0.51953125,  0.5234375 ,
         0.52734375,  0.53125   ,  0.53515625,  0.5390625 ,  0.54296875,
         0.546875  ,  0.55078125,  0.5546875 ,  0.55859375,  0.5625    ,
         0.56640625,  0.5703125 ,  0.57421875,  0.578125  ,  0.58203125,
         0.5859375 ,  0.58984375,  0.59375   ,  0.59765625,  0.6015625 ,
         0.60546875,  0.609375  ,  0.61328125,  0.6171875 ,  0.62109375,
         0.625     ,  0.62890625,  0.6328125 ,  0.63671875,  0.640625  ,
         0.64453125,  0.6484375 ,  0.65234375,  0.65625   ,  0.66015625,
         0.6640625 ,  0.66796875,  0.671875  ,  0.67578125,  0.6796875 ,
         0.68359375,  0.6875    ,  0.69140625,  0.6953125 ,  0.69921875,
         0.703125  ,  0.70703125,  0.7109375 ,  0.71484375,  0.71875   ,
         0.72265625,  0.7265625 ,  0.73046875,  0.734375  ,  0.73828125,
         0.7421875 ,  0.74609375,  0.75      ,  0.75390625,  0.7578125 ,
         0.76171875,  0.765625  ,  0.76953125,  0.7734375 ,  0.77734375,
         0.78125   ,  0.78515625,  0.7890625 ,  0.79296875,  0.796875  ,
         0.80078125,  0.8046875 ,  0.80859375,  0.8125    ,  0.81640625,
         0.8203125 ,  0.82421875,  0.828125  ,  0.83203125,  0.8359375 ,
         0.83984375,  0.84375   ,  0.84765625,  0.8515625 ,  0.85546875,
         0.859375  ,  0.86328125,  0.8671875 ,  0.87109375,  0.875     ,
         0.87890625,  0.8828125 ,  0.88671875,  0.890625  ,  0.89453125,
         0.8984375 ,  0.90234375,  0.90625   ,  0.91015625,  0.9140625 ,
         0.91796875,  0.921875  ,  0.92578125,  0.9296875 ,  0.93359375,
         0.9375    ,  0.94140625,  0.9453125 ,  0.94921875,  0.953125  ,
         0.95703125,  0.9609375 ,  0.96484375,  0.96875   ,  0.97265625,
         0.9765625 ,  0.98046875,  0.984375  ,  0.98828125,  0.9921875 ,
         0.99609375,  1.        ]),
 <a list of 256 Patch objects>)

In [38]:
#Give red, blue and green different names
redimg=img[:,:,0]
blueimg=img[:,:,1]
greenimg=img[:,:,2]

#Make an array of plots of the histograms of their pixel values
fig, ax = plt.subplots(1,3, figsize=(10,6))
ax[0].hist(redimg.ravel(), bins=256, range=(0.0, 1.0), fc='r', ec='r')
ax[1].hist(blueimg.ravel(), bins=256, range=(0.0, 1.0), fc='b', ec='b')
ax[2].hist(greenimg.ravel(), bins=256, range=(0.0, 1.0), fc='g', ec='g')

for a in ax:
    a.set_xticklabels([])
    a.set_yticklabels([])
plt.show()


In Class Exercise

2D Plotting of the Surface of Silicon

There is a file in the Github repository called stm.txt, which contains a grid of values from scanning tunneling microscope measurements of the (111) surface of silicon.

A scanning tunneling microscope (STM) is a device that measures the shape of a surface at the atomic level by tracking a sharp tip over the surface and measuring quantum tunneling current as a function of position. The end result is a grid of values that represent the height of the surface and the file stm.txt contains just such a grid of values.

Write a program that reads the data contained in the file and makes

  1. A histogram of the values
  2. A density plot of the values.

Use the various options and variants you have learned about to make a picture that shows the structure of the silicon surface clearly (This is Exercise 3.3 from Newman).

3D Plotting and Animations

Matplotlib has many options and function.

We will just run through a simple 3D plot, and a simple animation so you can get a sense of the possibilities, there is a lot more that you could do, and I am not an expert in this at all, so I will just point you to the Matplotlib tutorials to go further.

3D Plotting

For the most basic 3D plot:


In [31]:
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')


The below code is an example plot of a function in 3D taken from


In [32]:
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm
from matplotlib.ticker import LinearLocator, FormatStrFormatter
import matplotlib.pyplot as plt
import numpy as np

fig = plt.figure()
ax = fig.gca(projection='3d')
X = np.arange(-5, 5, 0.25)
Y = np.arange(-5, 5, 0.25)
X, Y = np.meshgrid(X, Y)
R = np.sqrt(X**2 + Y**2)
Z = np.sin(R)
surf = ax.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap=cm.coolwarm,
                       linewidth=0, antialiased=False)
ax.set_zlim(-1.01, 1.01)

ax.zaxis.set_major_locator(LinearLocator(10))
ax.zaxis.set_major_formatter(FormatStrFormatter('%.02f'))

fig.colorbar(surf, shrink=0.5, aspect=5)

plt.show()


Animation

In Newman, both 3D graphics and animations make use of the visual package, however this does not appear to be installed as standard in Anaconda.

Matplotlib also has animation packages.

Animation is a place where Notebooks are not yet fully integrated, so we'll go through this in the terminal.

Animation is also more system dependent - don't forget the "Google the error message" method of debugging.

(e.g. I have to set blit=False in the animation keywords on a Mac, it should work with either on Windows/Linux but may be faster with blit=True).