Python Hour - Aug. 11, 2014

Meeting summary

  • Cim mentioned there is a Python toolbox called x-ray (short for eXtended arRAYs) that is designed to make working with multi-dimensional datasets a lot easier. xray allows you to add dimension names and coordinate values to numpy ndarrays. You can then use those dimension names and coordinate values to perform operations over the array. For example, with xray, you can find the time mean of an array by doing something like x.sum('time') - rather than x.sum(axis=0), if time was on the first dimension.
  • Jake asked about how to set the x-y aspect ratio of a plot. We talked about many ways of doing this. The easiest solution is probably to use the axes method set_aspect. An minimal working example is provided below.
  • Earle talked about different ways of saving Python data to file. Several options were mentioned; for e.g. (most) data can be saved as .mat, netcdf or HDF files. However, the standard way of saving Python objects to file is to use the pickle module. The syntax for saving objects using pickle is pickle.dump(obj,file,protocol). You can find examples of how to use pickle here. By default, pickle.dump(...) tries to save data in a human-readable format. This is ok for small, simple objects but saving large n-dimensional arrays this way is very slow and results in files that take up a lot of disk space. The protocol parameter allows users to change this behaviour. Setting protocol = pickle.HIGHEST_PROTOCOL or protocol = -1, tells pickle to use the fastest, most memory efficient option available. By using the highest protocol, you can save files that take up as much space as a netcdf or .mat file containing the same data.
  • Earle briefly showed how to make a plot with broken axes. An example is given below.
  • We had a pretty lengthy discussion about version control software - git, in particular. JPaul brought up the git rebase command and showed how it can be useful when pushing changes to a group repository. We also tried to understand the difference between git checkout and git reset. Cim mentioned that bitbucket is an alternative to github and offers free, online private repositories.
  • Jake says that PyHOG (PYthon Hour for Oceanographers and Geoscientists) sounds a lot better than GeoPUG (Geoscience Python User Group). So, PyHOG it is...

Making a plot with fixed aspect-ratio

To fix the aspect ratio of a plot, we use the set_aspect method from the axes class. But first, here is an example of what happens when you don't fix the aspect ratio:


In [1]:
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0,pi,0.01)
y = np.sin(20*x)

plt.figure(figsize = (5,4))
plt.plot(x,y)
plt.title('plot with undefined aspect ratio')
plt.show()


Now if we change the figure size, the plot gets distored:


In [2]:
plt.figure(figsize=(10,3))
plt.plot(x,y)
plt.title('plot with undefined aspect ratio')
plt.show()


By fixing the aspect ratio we can prevent future distortions. Below, we use the same figure size as the plot above:


In [3]:
plt.figure(figsize=(10,3)) #fig size same as before
ax = plt.gca() #you first need to get the axis handle
ax.set_aspect(1.5) #sets the height to width ratio to 1.5. 
#Use help(ax.set_aspect) for more options.

plt.plot(x,y)
plt.title('plot with Fixed aspect ratio')
plt.show()


Creating a plot with "broken axes"

Scenario: Say we have profile data from the deep ocean. The profile is for the full depth of the ocean but most of the signal is in the upper ocean. We want to make a plot of the full profile but have a zoomed in look at the upper ocean. We can do this by manipulating the subplot feature.

First lets make up some data:

We will create a velocity profile that has the form:

$vel = A\, e^{-\lambda\,z} \cdot \sin \left(\frac{a \,\pi\,z}{D} \right)\cdot \sin \left(\frac{b\,\pi\,z}{D} \right)$

where $a$, $b$, $A$ and $\lambda$ are arbitrary constants. $D$ is the full depth of the profile.


In [4]:
import numpy as np

z = arange(0,4000,10)
D = max(z)
a,b,A = 5,75,10 
lam = 0.01 

vel = A*np.exp(-lam*z)*np.sin((a*pi*z)/D)*np.sin((b*pi*z)/D)

First, we plot the profile using a single plot:


In [6]:
import matplotlib.pylab as plt

#create figure
plt.figure()

#get axes handle
ax = plt.gca()

#flip y-axis 
ax.invert_yaxis()

#plot data
plt.plot(vel,z)

#add labels
plt.xlabel('velocity profile')
plt.ylabel('Depth (m)')
plt.title('A velocity profile')

#display the plot
plt.show()


The plot looks fine but the information we care about is squashed in the upper portion of the plot. To showcase the signal more clearly, we will create a zoomed in version of the upper 500m. To do this we exploit the subplot function:


In [11]:
def brokenAxesDemo(vel,z):
    
    #I am  making this code into a function so that I can use it again in this demo.
    #It's not a very useful function since a lot of things are hard coded.

    #create a new figure with two subplots
    fig,(ax1,ax2) = plt.subplots(2, 1, sharex=True)

    #set the "zoom" or the y-limits on each subplots
    ax1.set_ylim(0,500)
    ax2.set_ylim(600,4000)
    
    #set ytick marks for upper plot (optional, but the default tick marks may not look nice)
    upper_yticks = np.arange(0,501,100) 
    ax1.set_yticks(upper_yticks)
    
    #set ytick marks for lower plot (optional, but the default tick marks may not look nice)
    lower_yticks = np.arange(600,4000,1000)
    ax2.set_yticks(lower_yticks)
    
    #invert the y-axis (since we are plotting profile data)
    ax1.invert_yaxis()
    ax2.invert_yaxis()

    #now plot data into each subplot
    ax1.plot(vel,z)
    ax2.plot(vel,z)
    
    #remove the bottom border from the top plot and the upper border from the bottom plot
    ax1.spines['bottom'].set_visible(False)
    ax2.spines['top'].set_visible(False)
    
    #show only top x-axis tick marks on the top plot
    ax1.xaxis.tick_top() 
    ax1.tick_params(labeltop='off') # hides the labels from the top x-axis
    
    #show only bottom x-axis tick marks on the lower plot
    ax2.xaxis.tick_bottom()
    
    
    #squeeze plots closer
    plt.subplots_adjust(hspace=0.2) #set to zero, if you want to join the two plots
    
    
    #add figure labels
    ax2.set_xlabel('velocity')
    ax1.set_ylabel('Depth (m)')
    ax2.set_ylabel('Depth (m)')
    ax1.set_title('Velocity profile after zooming into upper 500m.')
    
    return ax1, ax2

In [12]:
ax1,ax2 = brokenAxesDemo(vel,z)


So now we can see the upper ocean signal more clearly. All this has been pretty painless so far. We may also want to add some slashes to indicate where the y-axis is broken. Doing this requires a bit more work, but you can save yourself the trouble by copying the function below:


In [13]:
def addBreakClips(ax1,ax2):
    """ Code to add diagonal slashes to truncated y-axes.
    copied from http://matplotlib.org/examples/pylab_examples/broken_axis.html"""
    
    d = .015 # how big to make the diagonal lines in axes coordinates
    # arguments to pass plot, just so we don't keep repeating them
    kwargs = dict(transform=ax1.transAxes, color='r', clip_on=False)
    ax1.plot((-d,+d),(-d,+d), **kwargs)      # top-left diagonal
    ax1.plot((1-d,1+d),(-d,+d), **kwargs)    # top-right diagonal

    kwargs.update(transform=ax2.transAxes)  # switch to the bottom axes
    ax2.plot((-d,+d),(1-d,1+d), **kwargs)   # bottom-left diagonal
    ax2.plot((1-d,1+d),(1-d,1+d), **kwargs) # bottom-right diagonal

Let's remake the plot and add the slashes:


In [14]:
ax1,ax2 = brokenAxesDemo(vel,z)
addBreakClips(ax1,ax2)


Hopefully this demo makes sense. The trickiest part is probably choosing the right tick marks on the y-axis. It should be relatively easy to modify the code to create a broken x-axis.