Massimo Nocentini, PhD.

February 7, 2020: init

A (very concise) introduction to matplotlib.

```python
__AUTHORS__ = {'am': ("Andrea Marino",
                      "andrea.marino@unifi.it",),
               'mn': ("Massimo Nocentini",
                      "massimo.nocentini@unifi.it",
                      "https://github.com/massimo-nocentini/",)}
__KEYWORDS__ = ['Python', 'Jupyter', 'matplotlib', 'keynote',]
```

We'll now take an in-depth look at the Matplotlib package for visualization in Python. Matplotlib is a multi-platform data visualization library built on NumPy arrays, and designed to work with the broader SciPy stack. It was conceived by John Hunter in 2002, originally as a patch to IPython for enabling interactive MATLAB-style plotting via gnuplot from the IPython command line. IPython's creator, Fernando Perez, was at the time scrambling to finish his PhD, and let John know he wouldn’t have time to review the patch for several months. John took this as a cue to set out on his own, and the Matplotlib package was born, with version 0.1 released in 2003. It received an early boost when it was adopted as the plotting package of choice of the Space Telescope Science Institute (the folks behind the Hubble Telescope), which financially supported Matplotlib’s development and greatly expanded its capabilities.

One of Matplotlib’s most important features is its ability to play well with many operating systems and graphics backends. Matplotlib supports dozens of backends and output types, which means you can count on it to work regardless of which operating system you are using or which output format you wish. This cross-platform, everything-to-everyone approach has been one of the great strengths of Matplotlib. It has led to a large user base, which in turn has led to an active developer base and Matplotlib’s powerful tools and ubiquity within the scientific Python world.

```python
import matplotlib as mpl
import matplotlib.pyplot as plt
```

```python
plt.style.use('classic')
```

The IPython notebook is a browser-based interactive data analysis tool that can combine narrative, code, graphics, and HTML elements.

Plotting interactively within an IPython notebook can be done with the `%matplotlib` command, and works in a similar way to the IPython shell.
In the IPython notebook, you also have the option of embedding graphics directly in the notebook, with two possible options:

- `%matplotlib notebook` will lead to *interactive* plots embedded within the notebook
- `%matplotlib inline` will lead to *static* images of your plot embedded in the notebook

For this book, we will generally opt for `%matplotlib inline`:

```python
%matplotlib inline
```

```python
import numpy as np
x = np.linspace(0, 10, 100)

fig = plt.figure()
plt.plot(x, np.sin(x), '-')
plt.plot(x, np.cos(x), '--');
```

```python
fig.savefig('my_figure.png')
```

We now have a file called `my_figure.png` in the current working directory:

```python
!ls -lh my_figure.png
```

In `savefig()`, the file format is inferred from the extension of the given filename.
Depending on what backends you have installed, many different file formats are available.
The list of supported file types can be found for your system by using the following method of the figure canvas object:

```python
fig.canvas.get_supported_filetypes()
```
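
As a small sketch of the format handling described above, a figure can also be written to an in-memory buffer. The `io.BytesIO` buffer, the explicit `format=` argument, and the `Agg` backend are illustrative choices made here, not part of the original notebook. A buffer has no filename extension to infer from, so the format has to be named explicitly:

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit this line in a notebook
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

# Mapping of file extension -> description, as reported by the active canvas
supported = fig.canvas.get_supported_filetypes()

# No extension is available here, so the format is passed explicitly
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()

print(sorted(supported))
print(len(png_bytes), "bytes written")
plt.close(fig)
```

The set of extensions printed depends on the installed backends, so it may differ from one system to another.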

```python
plt.figure()  # create a plot figure

# create the first of two panels and set current axis
plt.subplot(2, 1, 1)  # (rows, columns, panel number)
plt.plot(x, np.sin(x))

# create the second panel and set current axis
plt.subplot(2, 1, 2)
plt.plot(x, np.cos(x));
```

The object-oriented interface is available for these more complicated situations, and for when you want more control over your figure.
Rather than depending on some notion of an "active" figure or axes, in the object-oriented interface the plotting functions are *methods* of explicit `Figure` and `Axes` objects.
To re-create the previous plot using this style of plotting, you might do the following:

```python
# First create a grid of plots
# ax will be an array of two Axes objects
fig, ax = plt.subplots(2)

# Call plot() method on the appropriate object
ax[0].plot(x, np.sin(x))
ax[1].plot(x, np.cos(x));
```
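
To make the notion of an "active" axes concrete, the following sketch contrasts the two styles on the same figure. It assumes the `Agg` backend so it runs without a display, and it uses `plt.sca()` and `plt.gca()` (set and get the current axes), which do not appear in the text above:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)
fig, ax = plt.subplots(2)

# MATLAB-style: plt.plot() draws on whichever axes is currently active
plt.sca(ax[0])  # make the first panel the current axes
(line_state,) = plt.plot(x, np.sin(x))

# Object-oriented style: the target axes is explicit in the call
(line_oo,) = ax[1].plot(x, np.cos(x))

print(line_state.axes is ax[0])  # drawn on the active axes
print(line_oo.axes is ax[1])     # drawn on the axes we named
print(plt.gca() is ax[0])        # calling ax[1].plot() did not change the active axes
plt.close(fig)
```

The explicit form is easier to follow once a figure has several panels, because no call depends on what happened to be active beforehand.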

Perhaps the simplest of all plots is the visualization of a single function $y = f(x)$. Here we will take a first look at creating a simple plot of this type. As with all the following sections, we'll start by setting up the notebook for plotting and importing the packages we will use:

```python
%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('seaborn-whitegrid')  # named 'seaborn-v0_8-whitegrid' in Matplotlib 3.6 and later
import numpy as np
```

```python
fig = plt.figure()
ax = plt.axes()
```

The *figure* (an instance of the class `plt.Figure`) can be thought of as a single container that contains all the objects representing axes, graphics, text, and labels.
The *axes* (an instance of the class `plt.Axes`) is what we see above: a bounding box with ticks and labels, which will eventually contain the plot elements that make up our visualization.
Throughout this book, we'll commonly use the variable name `fig` to refer to a figure instance, and `ax` to refer to an axes instance or group of axes instances.
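
The container relationship between the two objects can be checked directly. This is a minimal sketch, assuming the `Agg` backend and importing the classes from `matplotlib.figure` and `matplotlib.axes` rather than through `plt`:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit this line in a notebook
import matplotlib.pyplot as plt
from matplotlib.axes import Axes
from matplotlib.figure import Figure

fig = plt.figure()
ax = plt.axes()  # added to the current figure, which is `fig`

is_figure = isinstance(fig, Figure)
is_axes = isinstance(ax, Axes)
axes_in_figure = fig.axes  # list of the axes the figure contains
parent = ax.figure         # the figure that owns this axes

print(is_figure, is_axes)
print(axes_in_figure == [ax], parent is fig)
plt.close(fig)
```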

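Once we have created an axes, we can use the `ax.plot` function to plot some data. Let's start with a simple sinusoid:

```python
fig = plt.figure()
ax = plt.axes()

x = np.linspace(0, 10, 1000)
ax.plot(x, np.sin(x));
```

The same plot can be produced with the `plt.plot` function, which creates the figure and axes in the background:

```python
plt.plot(x, np.sin(x));
```

To create a single figure with multiple lines, we can simply call the `plot` function multiple times:

```python
plt.plot(x, np.sin(x))
plt.plot(x, np.cos(x));
```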

The first adjustment you might wish to make to a plot is to control the line colors and styles.
The `plt.plot()` function takes additional arguments that can be used to specify these.
To adjust the color, you can use the `color` keyword, which accepts a string argument representing virtually any imaginable color.

```python
plt.plot(x, np.sin(x - 0), color='blue')           # specify color by name
plt.plot(x, np.sin(x - 1), color='g')              # short color code (rgbcmyk)
plt.plot(x, np.sin(x - 2), color='0.75')           # Grayscale between 0 and 1
plt.plot(x, np.sin(x - 3), color='#FFDD44')        # Hex code (RRGGBB from 00 to FF)
plt.plot(x, np.sin(x - 4), color=(1.0, 0.2, 0.3))  # RGB tuple, values 0 to 1
plt.plot(x, np.sin(x - 5), color='chartreuse');    # all HTML color names supported
```
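
To see how these different specifications are interpreted, they can be converted to RGBA tuples. This sketch relies on `matplotlib.colors.to_rgba`, which is not used in the original notebook, and simply prints the resolved value for each spec:

```python
from matplotlib.colors import to_rgba

# The same kinds of color specifications used in the plotting calls above
specs = ['blue', 'g', '0.75', '#FFDD44', (1.0, 0.2, 0.3), 'chartreuse']

# Resolve every spec to a (red, green, blue, alpha) tuple with values in [0, 1]
resolved = {str(spec): to_rgba(spec) for spec in specs}

for spec, rgba in resolved.items():
    print(f"{spec:>16} -> {rgba}")
```

All forms end up in the same representation, which is why they can be mixed freely in a single figure.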

Similarly, the line style can be adjusted using the `linestyle` keyword:

```python
plt.plot(x, x + 0, linestyle='solid')
plt.plot(x, x + 1, linestyle='dashed')
plt.plot(x, x + 2, linestyle='dashdot')
plt.plot(x, x + 3, linestyle='dotted');

# For short, you can use the following codes:
plt.plot(x, x + 4, linestyle='-')   # solid
plt.plot(x, x + 5, linestyle='--')  # dashed
plt.plot(x, x + 6, linestyle='-.')  # dashdot
plt.plot(x, x + 7, linestyle=':');  # dotted
```

The `linestyle` and `color` codes can be combined into a single non-keyword argument to the `plt.plot()` function:

```python
plt.plot(x, x + 0, '-g')   # solid green
plt.plot(x, x + 1, '--c')  # dashed cyan
plt.plot(x, x + 2, '-.k')  # dashdot black
plt.plot(x, x + 3, ':r');  # dotted red
```
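
The combined format string is shorthand for the separate keywords. As a rough check (assuming the `Agg` backend, and using the `Line2D` getters `get_linestyle()` and `get_color()`, which the text does not mention), the two spellings below yield lines with the same style:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import to_rgba

x = np.linspace(0, 10, 50)
fig, ax = plt.subplots()

# Combined format string versus explicit keywords
(short,) = ax.plot(x, x + 0, ':r')
(verbose,) = ax.plot(x, x + 1, linestyle='dotted', color='red')

same_style = short.get_linestyle() == verbose.get_linestyle()
same_color = to_rgba(short.get_color()) == to_rgba(verbose.get_color())

print(same_style, same_color)
plt.close(fig)
```

Note that not every one-letter code matches the color name it abbreviates exactly; in recent Matplotlib versions `'c'` and `'g'` are slightly darker than `'cyan'` and `'green'`.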

```python
plt.plot(x, np.sin(x))

# set explicit limits for the x and y axes
plt.xlim(-1, 11)
plt.ylim(-1.5, 1.5);
```

```python
plt.plot(x, np.sin(x))

# passing the limits in reverse order displays the axes in reverse
plt.xlim(10, 0)
plt.ylim(1.2, -1.2);
```

A useful related method is `plt.axis()` (note here the potential confusion between *axes* with an *e*, and *axis* with an *i*).
The `plt.axis()` method allows you to set the `x` and `y` limits with a single call, by passing a list which specifies `[xmin, xmax, ymin, ymax]`:

```python
plt.plot(x, np.sin(x))
plt.axis([-1, 11, -1.5, 1.5]);
```
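
The effect of these calls can be read back from the axes object. The sketch below assumes the `Agg` backend and uses the object-oriented equivalents `ax.set_xlim()` and `ax.axis()`, together with `ax.xaxis_inverted()`, to inspect the result:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)
fig, ax = plt.subplots()
ax.plot(x, np.sin(x))

# Limits given in reverse order flip the direction of the axis
ax.set_xlim(10, 0)
flipped = ax.xaxis_inverted()

# One call sets all four limits: [xmin, xmax, ymin, ymax]
limits = ax.axis([-1, 11, -1.5, 1.5])
restored = not ax.xaxis_inverted()

print(flipped, restored)
print(limits)
plt.close(fig)
```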

The `plt.axis()` method goes even beyond this, allowing you to do things like automatically tighten the bounds around the current plot:

```python
plt.plot(x, np.sin(x))
plt.axis('tight');
```

It also allows higher-level specifications, such as ensuring an equal aspect ratio so that one unit in `x` is equal to one unit in `y`:

```python
plt.plot(x, np.sin(x))
plt.axis('equal');
```

```python
plt.plot(x, np.sin(x))
plt.title("A Sine Curve")
plt.xlabel("x")
plt.ylabel("sin(x)");
```

A plot legend that labels each line can be created with the `plt.legend()` method.
Though there are several valid ways of using this, I find it easiest to specify the label of each line using the `label` keyword of the plot function:

```python
plt.plot(x, np.sin(x), '-g', label='sin(x)')
plt.plot(x, np.cos(x), ':b', label='cos(x)')
plt.axis('equal')

plt.legend();
```

While most `plt` functions translate directly to `ax` methods (such as `plt.plot()` → `ax.plot()`, `plt.legend()` → `ax.legend()`, etc.), this is not the case for all commands.
In particular, functions to set limits, labels, and titles are slightly modified.
For transitioning between MATLAB-style functions and object-oriented methods, make the following changes:

- `plt.xlabel()` → `ax.set_xlabel()`
- `plt.ylabel()` → `ax.set_ylabel()`
- `plt.xlim()` → `ax.set_xlim()`
- `plt.ylim()` → `ax.set_ylim()`
- `plt.title()` → `ax.set_title()`

In the object-oriented interface to plotting, rather than calling these functions individually, it is often more convenient to use the `ax.set()` method to set all these properties at once:

```python
ax = plt.axes()
ax.plot(x, np.sin(x))
ax.set(xlim=(0, 10), ylim=(-2, 2),
       xlabel='x', ylabel='sin(x)',
       title='A Simple Plot');
```
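
As a quick check that `ax.set()` is equivalent to the individual setters listed above, the following sketch configures two axes in the two ways and compares what the getters return. The `Agg` backend and the two-panel layout are assumptions made for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit this line in a notebook
import matplotlib.pyplot as plt

fig, (ax_single, ax_bulk) = plt.subplots(1, 2)

# One setter call per property
ax_single.set_xlim(0, 10)
ax_single.set_ylim(-2, 2)
ax_single.set_xlabel('x')
ax_single.set_ylabel('sin(x)')
ax_single.set_title('A Simple Plot')

# The same properties in a single call
ax_bulk.set(xlim=(0, 10), ylim=(-2, 2),
            xlabel='x', ylabel='sin(x)', title='A Simple Plot')

getters = ['get_xlim', 'get_ylim', 'get_xlabel', 'get_ylabel', 'get_title']
settings_match = all(getattr(ax_single, name)() == getattr(ax_bulk, name)()
                     for name in getters)
print(settings_match)
plt.close(fig)
```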

Another commonly used plot type is the simple scatter plot, a close cousin of the line plot. Instead of points being joined by line segments, here the points are represented individually with a dot, circle, or other shape. We’ll start by setting up the notebook for plotting and importing the functions we will use:

```python
%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('seaborn-whitegrid')  # named 'seaborn-v0_8-whitegrid' in Matplotlib 3.6 and later
import numpy as np
```

```python
x = np.linspace(0, 10, 30)
y = np.sin(x)

plt.plot(x, y, 'o', color='black');
```

Just as you can specify options such as `'-'` and `'--'` to control the line style, the marker style has its own set of short string codes. The full list of available symbols can be seen in the documentation of `plt.plot`, or in Matplotlib's online documentation. Most of the possibilities are fairly intuitive, and we'll show a number of the more common ones here:

```python
rng = np.random.RandomState(0)
for marker in ['o', '.', ',', 'x', '+', 'v', '^', '<', '>', 's', 'd']:
    plt.plot(rng.rand(5), rng.rand(5), marker,
             label="marker='{0}'".format(marker))
plt.legend(numpoints=1)
plt.xlim(0, 1.8);
```

```python
plt.plot(x, y, '-ok');  # line (-), circle marker (o), black (k)
```

Additional keyword arguments to `plt.plot` specify a wide range of properties of the lines and markers:

```python
plt.plot(x, y, '-p', color='gray',
         markersize=15, linewidth=4,
         markerfacecolor='white',
         markeredgecolor='gray',
         markeredgewidth=2)
plt.ylim(-1.2, 1.2);
```

```python
plt.scatter(x, y, marker='o');
```

The primary difference of `plt.scatter` from `plt.plot` is that it can be used to create scatter plots where the properties of each individual point (size, face color, edge color, etc.) can be individually controlled or mapped to data.

To make overlapping points easier to see, we'll also use the `alpha` keyword to adjust the transparency level:

```python
rng = np.random.RandomState(0)
x = rng.randn(100)
y = rng.randn(100)
colors = rng.rand(100)
sizes = 1000 * rng.rand(100)

plt.scatter(x, y, c=colors, s=sizes, alpha=0.3, cmap='viridis')
plt.colorbar();  # show color scale
```

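Notice that the color argument is automatically mapped to a color scale (shown here by the `colorbar()` command), and that the size argument is given as a marker area in points squared.
In this way, the color and size of points can be used to convey information in the visualization, in order to visualize multidimensional data.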
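
The per-point mapping can be inspected on the object that `scatter` returns. This is a minimal sketch, assuming the `Agg` backend and using the collection methods `get_sizes()` and `get_array()`, which are not discussed in the text:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.RandomState(0)
x = rng.randn(100)
y = rng.randn(100)
colors = rng.rand(100)
sizes = 1000 * rng.rand(100)

fig, ax = plt.subplots()
points = ax.scatter(x, y, c=colors, s=sizes, alpha=0.3, cmap='viridis')
fig.colorbar(points, ax=ax)  # color scale for the values passed as `c`

per_point_sizes = points.get_sizes()   # one size value per point
per_point_values = points.get_array()  # data values sent through the colormap

print(len(per_point_sizes), len(per_point_values))
plt.close(fig)
```

Each point keeps its own size and its own data value, which is what lets a single scatter plot encode two extra dimensions.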

```python
from sklearn.datasets import load_iris
iris = load_iris()
features = iris.data.T

plt.scatter(features[0], features[1], alpha=0.2,
            s=100*features[3], c=iris.target, cmap='viridis')
plt.xlabel(iris.feature_names[0])
plt.ylabel(iris.feature_names[1]);
```

`plot` Versus `scatter`: A Note on Efficiency

Aside from the different features available in `plt.plot` and `plt.scatter`, why might you choose to use one over the other? While it doesn't matter as much for small amounts of data, as datasets get larger than a few thousand points, `plt.plot` can be noticeably more efficient than `plt.scatter`.
The reason is that `plt.scatter` has the capability to render a different size and/or color for each point, so the renderer must do the extra work of constructing each point individually.
In `plt.plot`, on the other hand, the points are always essentially clones of each other, so the work of determining the appearance of the points is done only once for the entire set of data.
For large datasets, the difference between these two can lead to vastly different performance, and for this reason, `plt.plot` should be preferred over `plt.scatter` for large datasets.
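
One way to get a feel for the difference on a given machine is to time both calls. The helper `render_seconds`, the point count, and the choice of rendering to an in-memory PNG with the `Agg` backend are all assumptions of this sketch; the actual numbers depend on the hardware, the backend, and the Matplotlib version:

```python
import io
import time

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np


def render_seconds(draw, n_points=20_000):
    """Time one plotting call plus rendering the figure to an in-memory PNG."""
    rng = np.random.RandomState(0)
    x, y = rng.randn(n_points), rng.randn(n_points)
    fig, ax = plt.subplots()
    start = time.perf_counter()
    draw(ax, x, y)
    fig.savefig(io.BytesIO(), format="png")
    elapsed = time.perf_counter() - start
    plt.close(fig)
    return elapsed


# Warm-up run so one-time setup costs do not distort the first measurement
render_seconds(lambda ax, x, y: ax.plot(x, y, 'o'), n_points=10)

t_plot = render_seconds(lambda ax, x, y: ax.plot(x, y, 'o'))
t_scatter = render_seconds(lambda ax, x, y: ax.scatter(x, y))

print(f"plot:    {t_plot:.3f} s")
print(f"scatter: {t_scatter:.3f} s")
```

A single run is noisy, so it is worth repeating the measurement a few times and increasing `n_points` before drawing conclusions.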

For any scientific measurement, accurate accounting for errors is nearly as important as, if not more important than, accurate reporting of the number itself. For example, imagine that I am using some astrophysical observations to estimate the Hubble Constant, the local measurement of the expansion rate of the Universe. I know that the current literature suggests a value of around 71 (km/s)/Mpc, and I measure a value of 74 (km/s)/Mpc with my method. Are the values consistent? The only correct answer, given this information, is this: there is no way to know.

Suppose I augment this information with reported uncertainties: the current literature suggests a value of around 71 $\pm$ 2.5 (km/s)/Mpc, and my method has measured a value of 74 $\pm$ 5 (km/s)/Mpc. Now are the values consistent? That is a question that can be quantitatively answered.
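
One simple way to answer it, assuming the two uncertainties are independent and Gaussian (an assumption made here, not stated in the text), is to compare the difference of the two values with their uncertainties combined in quadrature:

```python
import math

# Values and uncertainties quoted above, in (km/s)/Mpc
literature, literature_err = 71.0, 2.5
measured, measured_err = 74.0, 5.0

# Assumption: independent Gaussian errors, so they add in quadrature
difference = measured - literature
combined_err = math.hypot(literature_err, measured_err)
n_sigma = difference / combined_err

print(f"difference = {difference:.1f} +/- {combined_err:.2f} (km/s)/Mpc "
      f"({n_sigma:.2f} sigma)")
```

Under that assumption the two values differ by about 0.54 standard deviations, so they are consistent with each other.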

In visualization of data and results, showing these errors effectively can make a plot convey much more complete information.

```python
%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('seaborn-whitegrid')  # named 'seaborn-v0_8-whitegrid' in Matplotlib 3.6 and later
import numpy as np
```

In `plt.errorbar`, `fmt` is a format code controlling the appearance of lines and points, and has the same syntax as the shorthand used in `plt.plot`:

```python
x = np.linspace(0, 10, 50)
dy = 0.8
y = np.sin(x) + dy * np.random.randn(50)

plt.errorbar(x, y, yerr=dy, fmt='.k');
```

The `errorbar` function has many options to fine-tune the outputs.
Using these additional options you can easily customize the aesthetics of your errorbar plot.
I often find it helpful, especially in crowded plots, to make the errorbars lighter than the points themselves:

```python
plt.errorbar(x, y, yerr=dy, fmt='o', color='black',
             ecolor='lightgray', elinewidth=3, capsize=0);
```
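
Among those options, `yerr` also accepts a pair of arrays for asymmetric errors. The sketch below is illustrative only: the lower and upper uncertainties are made-up constants, the `Agg` backend is assumed, and the returned container is unpacked just to show that one bar is drawn per point:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 50)
y = np.sin(x)
lower = np.full_like(x, 0.2)  # hypothetical downward uncertainties
upper = np.full_like(x, 0.6)  # hypothetical upward uncertainties

fig, ax = plt.subplots()
container = ax.errorbar(x, y, yerr=[lower, upper], fmt='o', color='black',
                        ecolor='lightgray', elinewidth=3, capsize=0)

# The container holds the data line, the cap lines, and the bar collections
data_line, cap_lines, bar_collections = container.lines
segments = bar_collections[0].get_segments()  # one vertical bar per point

print(len(segments))
print(segments[0])  # endpoints of the first bar: (x, y - lower) and (x, y + upper)
plt.close(fig)
```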

Sometimes it is useful to display three-dimensional data in two dimensions using contours or color-coded regions.
There are three Matplotlib functions that can be helpful for this task: `plt.contour` for contour plots, `plt.contourf` for filled contour plots, and `plt.imshow` for showing images.
This section looks at several examples of using these. We'll start by setting up the notebook for plotting and importing the functions we will use:

```python
%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('seaborn-white')  # named 'seaborn-v0_8-white' in Matplotlib 3.6 and later
import numpy as np
```

We'll start by demonstrating a contour plot using a function $z = f(x, y)$, using the following particular choice for $f$ (we've seen this before in Computation on Arrays: Broadcasting, when we used it as a motivating example for array broadcasting):

```python
def f(x, y):
    return np.sin(x) ** 10 + np.cos(10 + y * x) * np.cos(x)
```

A contour plot can be created with the `plt.contour` function.
It takes three arguments: a grid of *x* values, a grid of *y* values, and a grid of *z* values.
The *x* and *y* values represent positions on the plot, and the *z* values will be represented by the contour levels.
Perhaps the most straightforward way to prepare such data is to use the `np.meshgrid` function, which builds two-dimensional grids from one-dimensional arrays:

```python
x = np.linspace(0, 5, 50)
y = np.linspace(0, 5, 40)

X, Y = np.meshgrid(x, y)
Z = f(X, Y)
```
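
To make the shapes involved explicit, here is a self-contained sketch of the same grid construction. It repeats the definition of `f` so that it runs on its own, and it compares the result with plain broadcasting (the `Z_broadcast` name is introduced here for illustration):

```python
import numpy as np


def f(x, y):
    return np.sin(x) ** 10 + np.cos(10 + y * x) * np.cos(x)


x = np.linspace(0, 5, 50)  # 50 positions along x
y = np.linspace(0, 5, 40)  # 40 positions along y

# Each output has one row per y value and one column per x value
X, Y = np.meshgrid(x, y)
Z = f(X, Y)

# The same grid of z values, obtained by broadcasting a row against a column
Z_broadcast = f(x[np.newaxis, :], y[:, np.newaxis])

print(X.shape, Y.shape, Z.shape)
print(np.allclose(Z, Z_broadcast))
```

Every row of `X` is a copy of `x` and every column of `Y` is a copy of `y`, which is why the grids line up with the plot coordinates.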

Now let's look at this with a standard line-only contour plot:

```python
plt.contour(X, Y, Z, colors='black');
```

The lines can also be color-coded by specifying a colormap with the `cmap` argument.
Here, we'll also specify that we want more lines to be drawn, asking for 20 equally spaced intervals within the data range:

```python
plt.contour(X, Y, Z, 20, cmap='RdGy');
```
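
When an integer is passed, Matplotlib chooses the level values itself. If the exact values matter, they can be given explicitly through the `levels` argument. The following sketch assumes the `Agg` backend, rebuilds the grid so that it is self-contained, and uses levels spanning the data range as one possible choice:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np


def f(x, y):
    return np.sin(x) ** 10 + np.cos(10 + y * x) * np.cos(x)


X, Y = np.meshgrid(np.linspace(0, 5, 50), np.linspace(0, 5, 40))
Z = f(X, Y)

# 21 boundaries give 20 equally spaced intervals across the data range
levels = np.linspace(Z.min(), Z.max(), 21)

fig, ax = plt.subplots()
contours = ax.contour(X, Y, Z, levels=levels, cmap='RdGy')
drawn_levels = contours.levels  # the level values actually used

print(drawn_levels)
plt.close(fig)
```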

Here we chose the `RdGy` (short for *Red-Gray*) colormap, which is a good choice for centered data.
Matplotlib has a wide range of colormaps available, which you can easily browse in IPython by doing a tab completion on the `plt.cm` module:

`plt.cm.<TAB>`

Our plot is looking nicer, but the spaces between the lines may be a bit distracting.
We can change this by switching to a filled contour plot using the `plt.contourf()` function (notice the `f` at the end), which uses largely the same syntax as `plt.contour()`.

Additionally, we'll add a `plt.colorbar()` command, which automatically creates an additional axis with labeled color information for the plot:

```python
plt.contourf(X, Y, Z, 20, cmap='RdGy')
plt.colorbar();  # The colorbar makes it clear that the black regions
                 # are "peaks," while the red regions are "valleys."
```

An alternative is the `plt.imshow()` function, which interprets a two-dimensional grid of data as an image. The following code shows this:

```python
plt.imshow(Z, extent=[0, 5, 0, 5], origin='lower', cmap='RdGy')
plt.colorbar()
plt.axis('image');  # the mode is a positional string; plt.axis() has no `aspect` keyword
```

There are a few potential gotchas with `imshow()`, however:

- `plt.imshow()` doesn't accept an *x* and *y* grid, so you must manually specify the *extent* [*xmin*, *xmax*, *ymin*, *ymax*] of the image on the plot.
- `plt.imshow()` by default follows the standard image array definition where the origin is in the upper left, not in the lower left as in most contour plots. This must be changed when showing gridded data.
- `plt.imshow()` will automatically adjust the axis aspect ratio to match the input data; this can be changed by setting, for example, `plt.axis('image')` to make *x* and *y* units match.
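
The first two points can be seen by comparing the defaults with the explicit settings. This sketch assumes the `Agg` backend and default rc settings, rebuilds the grid so that it runs on its own, and reads the result back with `get_extent()` and `yaxis_inverted()`:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np


def f(x, y):
    return np.sin(x) ** 10 + np.cos(10 + y * x) * np.cos(x)


X, Y = np.meshgrid(np.linspace(0, 5, 50), np.linspace(0, 5, 40))
Z = f(X, Y)

fig, (ax_default, ax_grid) = plt.subplots(1, 2)

# Defaults: origin in the upper left, extent measured in pixel indices
im_default = ax_default.imshow(Z, cmap='RdGy')

# Gridded data: origin in the lower left, extent in data coordinates
im_grid = ax_grid.imshow(Z, extent=[0, 5, 0, 5], origin='lower', cmap='RdGy')

print(tuple(im_default.get_extent()), ax_default.yaxis_inverted())
print(tuple(im_grid.get_extent()), ax_grid.yaxis_inverted())
plt.close(fig)
```

With the defaults the *y* axis runs downward and the tick labels count array rows and columns, so the image is flipped relative to the contour plots above.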

Contour plots and image plots can also be combined. For example, here we'll use a partially transparent background image (with transparency set via the `alpha` parameter) and overplot contours with labels on the contours themselves (using the `plt.clabel()` function):

```python
contours = plt.contour(X, Y, Z, 3, colors='black')
plt.clabel(contours, inline=True, fontsize=8)

plt.imshow(Z, extent=[0, 5, 0, 5], origin='lower',
           cmap='RdGy', alpha=0.5)
plt.colorbar();
```