This lesson requires the Basemap toolkit for Matplotlib. This library is not distributed with Matplotlib directly. If you are using Continuum's Anaconda distribution, you can obtain the library using:
conda install basemap
If you are using Enthought Canopy and have the full version or an academic license, Basemap should already be installed on your system. Otherwise, you will need to follow the installation instructions in the Basemap documentation. Using one of the two scientific distributions is the easier route in most cases.
Original materials by Joshua Adelman; modified by Randy Olson
We are examining some simple spatial coordinate data, specifically the locations of all previous Software Carpentry bootcamps. The data set is stored in comma-separated values (CSV) format. After the header line (marked with a #), each row contains the latitude and longitude of one bootcamp, separated by a comma.
# Latitude, Longitude
43.661476,-79.395189
39.332604,-76.623190
45.703255, 13.718013
43.661476,-79.395189
39.166381,-86.526621
...
We want to:
To do this, we'll begin to delve into working with Python and do a bit of programming.
In order to work with the coordinates stored in the file, we need to import a library called NumPy that is designed to easily handle arrays of data.
In [1]:
import numpy as np
It's very common to create an alias for a library when importing it in order to reduce the amount of typing we have to do. We can now refer to this library in the code as np instead of typing out numpy each time we want to use it.
We can now ask numpy to read our data file:
In [2]:
lat, lon = np.loadtxt('swc_bc_coords.csv', delimiter=',', unpack=True)
The expression np.loadtxt(...) means, "Run the function loadtxt that belongs to the numpy library." This dotted notation is used everywhere in Python to refer to the parts of larger things.
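As a small illustration of the alias and the dotted notation, the sketch below checks that np and numpy are two names for the same library, and that a dotted name reaches the same function through either one. Importing the library twice is done here only for the comparison; it is not something the lesson itself requires.

```python
import numpy          # the library under its full name
import numpy as np    # the same library under the short alias

# Both names refer to the very same module object.
print(np is numpy)

# Dotted notation looks up a part of the library, whichever name we use.
print(np.loadtxt is numpy.loadtxt)
```

Both lines print True, since an alias only adds a second name and does not make a copy.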
np.loadtxt is called here with three arguments: the name of the file we want to read, the delimiter that separates values on a line, and unpack. The first two need to be character strings (or strings for short), so we put them in quotes. Finally, passing the unpack parameter the boolean value True tells np.loadtxt to return the columns separately, so that the first and second columns of data can be assigned to the variables lat and lon, respectively.
A variable is just a name for some data.
Also note that np.loadtxt automatically skipped the line with the header information: by default it treats any line that starts with # as a comment rather than as numerical data.
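To see the effect of unpack and the comment handling without needing the data file, here is a minimal sketch that reads a few of the sample rows from an in-memory string. The io.StringIO stand-in and the variable name sample are assumptions made for illustration; the lesson itself reads swc_bc_coords.csv from disk.

```python
import io
import numpy as np

# An in-memory stand-in for the CSV file, using three rows from the sample above.
sample = io.StringIO(
    "# Latitude, Longitude\n"
    "43.661476,-79.395189\n"
    "39.332604,-76.623190\n"
    "39.166381,-86.526621\n"
)

# The header line starts with '#', the default comment marker, so it is skipped.
# unpack=True returns one array per column instead of one array of rows.
lat, lon = np.loadtxt(sample, delimiter=',', unpack=True)

print(lat)  # the three latitudes
print(lon)  # the three longitudes
```

Without unpack=True, the same call would return a single two-dimensional array with one row per bootcamp, and the assignment to two names would fail for this three-row sample.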
When we are finished typing and press Shift+Enter, the notebook runs our command. lat and lon now contain our data, which we can inspect by just executing a cell with the name of a variable:
In [3]:
lat
Out[3]:
The array is a type of container defined by numpy to hold values. We will discuss how to manipulate arrays in more detail in another lesson.
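Beyond printing the whole array, a few attributes give a quick summary of what it holds. The sketch below uses the five latitudes from the sample listing, typed in by hand so that it runs without the data file; with the real file, lat would simply be longer.

```python
import numpy as np

# The five latitudes shown in the sample listing, entered by hand.
lat = np.array([43.661476, 39.332604, 45.703255, 43.661476, 39.166381])

print(type(lat))             # the array type defined by NumPy
print(lat.shape)             # how many values there are along each dimension
print(lat.dtype)             # the type of the stored values
print(lat.min(), lat.max())  # southernmost and northernmost latitude
```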
For now let's just make a simple plot of the data. For this, we will use another library called matplotlib. First, let's tell the IPython Notebook that we want our plots displayed inline, rather than in a separate viewing window:
In [4]:
%matplotlib inline
The % at the start of the line signals that this is a command for the notebook, rather than a statement in Python.
Next, we will import the pyplot module from matplotlib and use one of the commands it defines to plot a point for each latitude, longitude pair of data.
In [5]:
from matplotlib import pyplot
pyplot.plot(lon, lat, 'o')
Out[5]:
Plot the dots with a different color according to the continent they would be on.
In [5]:
While matplotlib provides a simple facility for visualizing numerical data in a variety of ways, we will use a supplementary toolkit called Basemap that enhances matplotlib to specifically deal with spatial data. We need to import this library and can do so using:
In [6]:
from mpl_toolkits.basemap import Basemap
Now let's create a Basemap object that will allow us to project the coordinates onto a map. For this example we are going to use a Robinson projection.
In [7]:
basemap_graph = Basemap(projection='robin', lat_0=0.0, lon_0=0.0)
The parameters lat_0 and lon_0 define the center of the map. Now let's add some features to our map using methods defined by the basemap_graph object. We will also call the object itself to convert our original longitudes and latitudes into the coordinates of the bootcamps in the map projection. Finally, we will tell pyplot to make the figure 12 inches by 12 inches to make it more legible.
In [8]:
pyplot.figure(figsize=(12,12))
basemap_graph.drawcoastlines()
basemap_graph.drawcountries()
basemap_graph.fillcontinents()
basemap_graph.drawmeridians(np.arange(-180,180,20))
basemap_graph.drawparallels(np.arange(-90,90,20))
x, y = basemap_graph(lon, lat)
basemap_graph.plot(x, y, 'o', markersize=4, color='red')
Out[8]:
The final line of the above code cell uses the Basemap object's plot method, which mimics matplotlib's built-in plot function, to draw our projected coordinates on the map.
With just a handful of lines of code, you see that we can create a rich visualization of our data.
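The grid lines in the map come from the np.arange calls passed to drawmeridians and drawparallels. A short sketch of what those calls produce, using only NumPy, shows that the stop value is not included. The names meridians and parallels are introduced here for illustration only.

```python
import numpy as np

# np.arange(start, stop, step) counts from start up to, but not including, stop.
meridians = np.arange(-180, 180, 20)
parallels = np.arange(-90, 90, 20)

print(meridians)  # longitudes at which a meridian is drawn
print(parallels)  # latitudes at which a parallel is drawn

# The last values fall one step short of the stop value.
print(meridians[-1], parallels[-1])
```

For the meridians this loses nothing, since 180 and -180 are the same line on the globe. For the parallels it means the last line drawn is at 70 degrees north, so a different stop value would be needed if a line closer to the pole were wanted.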