Class 4: matplotlib (and a quick NumPy example)

Brief introduction to the matplotlib module.

Preliminary example: Economic growth

A country with GDP in year $t-1$ denoted by $y_{t-1}$ and an annual GDP growth rate of $g$ will have GDP in year $t$ given by the recursive equation:

\begin{align} y_{t} & = (1+g)y_{t-1} \end{align}

Given an initial value of $y_0$, we can find $y_t$ for any given $t$ in one of two ways:

  1. By iterating on the equation
  2. Or by using substitution and deriving: \begin{align} y_t & = (1+g)^t y_0 \end{align}

In this example we'll do both.
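
To see where the closed-form expression in the second approach comes from, substitute the recursive equation into itself one year at a time:

\begin{align}
y_1 & = (1+g)y_0 \\
y_2 & = (1+g)y_1 = (1+g)^2 y_0 \\
y_3 & = (1+g)y_2 = (1+g)^3 y_0 \\
& \;\;\vdots \\
y_t & = (1+g)y_{t-1} = (1+g)^t y_0
\end{align}

Each year multiplies GDP by one more factor of $(1+g)$, so after $t$ years the initial value $y_0$ has been multiplied by $(1+g)$ exactly $t$ times.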


In [1]:
# Import numpy
import numpy as np

# Define T, y0, and g
T = 40
y0 = 50
g = 0.01

# Compute yT using the direct approach and print
yT = (1+g)**T*y0
print('Direct approach:   ',yT)

# Initialize a 1-dimensional array called y that has T+1 zeros 
y = np.zeros(T+1)

# Set the initial value of y to equal y0
y[0] = y0

# Use a for loop to update the values of y one at a time
for t in np.arange(T):
    y[t+1] = (1+g)*y[t]


# Print the final value in the array y
print('Iterative approach:',y[-1])


Direct approach:    74.44318667941107
Iterative approach: 74.4431866794
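
As a cross-check on the two approaches, the sketch below computes the entire GDP path from the closed-form expression in one vectorized NumPy step and compares it with the path built by iteration. The names t, y_direct, and y_iter are chosen for this illustration and are not part of the original cell.

```python
import numpy as np

# Same parameter values as the cell above
T = 40
y0 = 50
g = 0.01

# Exponents 0, 1, ..., T give the whole path in one expression
t = np.arange(T+1)
y_direct = (1+g)**t*y0

# Rebuild the path by iteration for comparison
y_iter = np.zeros(T+1)
y_iter[0] = y0
for s in range(T):
    y_iter[s+1] = (1+g)*y_iter[s]

# The two paths should agree up to floating-point rounding
print('Paths agree:', np.allclose(y_direct, y_iter))
```

The comparison uses np.allclose rather than == because the two calculations round differently in the last few digits.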

matplotlib

matplotlib is a powerful plotting module. It is a third-party package, not part of Python's standard library, though it comes bundled with scientific distributions such as Anaconda. The website for matplotlib is at http://matplotlib.org/, and you can find many examples at the following two locations: http://matplotlib.org/examples/index.html and http://matplotlib.org/gallery.html.

matplotlib contains a module called pyplot that was written to provide a MATLAB-style plotting interface.


In [2]:
# Import matplotlib.pyplot
import matplotlib.pyplot as plt

Next, we want to make sure that the plots we create are displayed in this notebook. To achieve this, we issue a command to be interpreted by Jupyter, called a magic command. A magic command is preceded by a % character. Magics are not Python and will raise errors if used outside of a Jupyter notebook.


In [3]:
# Magic command for the Jupyter Notebook
%matplotlib inline

A quick matplotlib example

Create a plot of the sine function for x values between -6 and 6. Add axis labels and a title.


In [4]:
# Import numpy as np
import numpy as np

# Create an array of x values from -6 to 6
x = np.arange(-6,6,0.001)

# Create a variable y equal to the sine of x
y = np.sin(x)

# Use the plot function to plot the sine function
plt.plot(x,y)

# Add a title and axis labels
plt.title('sin(x)')
plt.xlabel('x')
plt.ylabel('y')


Out[4]:
<matplotlib.text.Text at 0x10aac2668>
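
The Out[4] line above is not an error. Jupyter echoes the value of the last expression in a cell, and plt.ylabel() returns the text object it creates. A minimal sketch of this behavior follows; the variable name label is chosen for illustration.

```python
import matplotlib.pyplot as plt
import matplotlib.text

# plt.ylabel returns the Text object for the label it creates
label = plt.ylabel('y')
print(isinstance(label, matplotlib.text.Text))

# Ending the last line with a semicolon, or assigning the result to a
# variable as above, keeps Jupyter from echoing it as Out[...]
plt.xlabel('x');
```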

The plot function

The plot function creates a two-dimensional plot of one variable against another.


In [5]:
# Use the help function to see the documentation for plot
help(plt.plot)


Help on function plot in module matplotlib.pyplot:

plot(*args, **kwargs)
    Plot lines and/or markers to the
    :class:`~matplotlib.axes.Axes`.  *args* is a variable length
    argument, allowing for multiple *x*, *y* pairs with an
    optional format string.  For example, each of the following is
    legal::
    
        plot(x, y)        # plot x and y using default line style and color
        plot(x, y, 'bo')  # plot x and y using blue circle markers
        plot(y)           # plot y using x as index array 0..N-1
        plot(y, 'r+')     # ditto, but with red plusses
    
    If *x* and/or *y* is 2-dimensional, then the corresponding columns
    will be plotted.
    
    If used with labeled data, make sure that the color spec is not
    included as an element in data, as otherwise the last case
    ``plot("v","r", data={"v":..., "r":...)``
    can be interpreted as the first case which would do ``plot(v, r)``
    using the default line style and color.
    
    If not used with labeled data (i.e., without a data argument),
    an arbitrary number of *x*, *y*, *fmt* groups can be specified, as in::
    
        a.plot(x1, y1, 'g^', x2, y2, 'g-')
    
    Return value is a list of lines that were added.
    
    By default, each line is assigned a different style specified by a
    'style cycle'.  To change this behavior, you can edit the
    axes.prop_cycle rcParam.
    
    The following format string characters are accepted to control
    the line style or marker:
    
    ================    ===============================
    character           description
    ================    ===============================
    ``'-'``             solid line style
    ``'--'``            dashed line style
    ``'-.'``            dash-dot line style
    ``':'``             dotted line style
    ``'.'``             point marker
    ``','``             pixel marker
    ``'o'``             circle marker
    ``'v'``             triangle_down marker
    ``'^'``             triangle_up marker
    ``'<'``             triangle_left marker
    ``'>'``             triangle_right marker
    ``'1'``             tri_down marker
    ``'2'``             tri_up marker
    ``'3'``             tri_left marker
    ``'4'``             tri_right marker
    ``'s'``             square marker
    ``'p'``             pentagon marker
    ``'*'``             star marker
    ``'h'``             hexagon1 marker
    ``'H'``             hexagon2 marker
    ``'+'``             plus marker
    ``'x'``             x marker
    ``'D'``             diamond marker
    ``'d'``             thin_diamond marker
    ``'|'``             vline marker
    ``'_'``             hline marker
    ================    ===============================
    
    
    The following color abbreviations are supported:
    
    ==========  ========
    character   color
    ==========  ========
    'b'         blue
    'g'         green
    'r'         red
    'c'         cyan
    'm'         magenta
    'y'         yellow
    'k'         black
    'w'         white
    ==========  ========
    
    In addition, you can specify colors in many weird and
    wonderful ways, including full names (``'green'``), hex
    strings (``'#008000'``), RGB or RGBA tuples (``(0,1,0,1)``) or
    grayscale intensities as a string (``'0.8'``).  Of these, the
    string specifications can be used in place of a ``fmt`` group,
    but the tuple forms can be used only as ``kwargs``.
    
    Line styles and colors are combined in a single format string, as in
    ``'bo'`` for blue circles.
    
    The *kwargs* can be used to set line properties (any property that has
    a ``set_*`` method).  You can use this to set a line label (for auto
    legends), linewidth, anitialising, marker face color, etc.  Here is an
    example::
    
        plot([1,2,3], [1,2,3], 'go-', label='line 1', linewidth=2)
        plot([1,2,3], [1,4,9], 'rs',  label='line 2')
        axis([0, 4, 0, 10])
        legend()
    
    If you make multiple lines with one plot command, the kwargs
    apply to all those lines, e.g.::
    
        plot(x1, y1, x2, y2, antialiased=False)
    
    Neither line will be antialiased.
    
    You do not need to use format strings, which are just
    abbreviations.  All of the line properties can be controlled
    by keyword arguments.  For example, you can set the color,
    marker, linestyle, and markercolor with::
    
        plot(x, y, color='green', linestyle='dashed', marker='o',
             markerfacecolor='blue', markersize=12).
    
    See :class:`~matplotlib.lines.Line2D` for details.
    
    The kwargs are :class:`~matplotlib.lines.Line2D` properties:
    
      agg_filter: unknown
      alpha: float (0.0 transparent through 1.0 opaque) 
      animated: [True | False] 
      antialiased or aa: [True | False] 
      axes: an :class:`~matplotlib.axes.Axes` instance 
      clip_box: a :class:`matplotlib.transforms.Bbox` instance 
      clip_on: [True | False] 
      clip_path: [ (:class:`~matplotlib.path.Path`, :class:`~matplotlib.transforms.Transform`) | :class:`~matplotlib.patches.Patch` | None ] 
      color or c: any matplotlib color 
      contains: a callable function 
      dash_capstyle: ['butt' | 'round' | 'projecting'] 
      dash_joinstyle: ['miter' | 'round' | 'bevel'] 
      dashes: sequence of on/off ink in points 
      drawstyle: ['default' | 'steps' | 'steps-pre' | 'steps-mid' | 'steps-post'] 
      figure: a :class:`matplotlib.figure.Figure` instance 
      fillstyle: ['full' | 'left' | 'right' | 'bottom' | 'top' | 'none'] 
      gid: an id string 
      label: string or anything printable with '%s' conversion. 
      linestyle or ls: ['solid' | 'dashed', 'dashdot', 'dotted' | (offset, on-off-dash-seq) | ``'-'`` | ``'--'`` | ``'-.'`` | ``':'`` | ``'None'`` | ``' '`` | ``''``]
      linewidth or lw: float value in points 
      marker: :mod:`A valid marker style <matplotlib.markers>`
      markeredgecolor or mec: any matplotlib color 
      markeredgewidth or mew: float value in points 
      markerfacecolor or mfc: any matplotlib color 
      markerfacecoloralt or mfcalt: any matplotlib color 
      markersize or ms: float 
      markevery: [None | int | length-2 tuple of int | slice | list/array of int | float | length-2 tuple of float]
      path_effects: unknown
      picker: float distance in points or callable pick function ``fn(artist, event)`` 
      pickradius: float distance in points 
      rasterized: [True | False | None] 
      sketch_params: unknown
      snap: unknown
      solid_capstyle: ['butt' | 'round' |  'projecting'] 
      solid_joinstyle: ['miter' | 'round' | 'bevel'] 
      transform: a :class:`matplotlib.transforms.Transform` instance 
      url: a url string 
      visible: [True | False] 
      xdata: 1D array 
      ydata: 1D array 
      zorder: any number 
    
    kwargs *scalex* and *scaley*, if defined, are passed on to
    :meth:`~matplotlib.axes.Axes.autoscale_view` to determine
    whether the *x* and *y* axes are autoscaled; the default is
    *True*.
    
    Notes
    -----
    
    In addition to the above described arguments, this function can take a
    **data** keyword argument. If such a **data** argument is given, the
    following arguments are replaced by **data[<arg>]**:
    
    * All arguments with the following names: 'x', 'y'.
    
    
    
    
    Additional kwargs: hold = [True|False] overrides default hold state
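
The help text describes two ways to style a line: a compact format string or explicit keyword arguments. The sketch below draws one line each way and checks that the resulting line objects carry the same style. The names lines_fmt, lines_kw, line_fmt, and line_kw are chosen for this illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba

x = np.linspace(0, 1, 5)

# Format string: red ('r'), circle markers ('o'), dashed line ('--')
lines_fmt = plt.plot(x, x**2, 'ro--')

# The same style spelled out with keyword arguments
lines_kw = plt.plot(x, x**3, color='red', marker='o', linestyle='dashed')

# plot returns a list of Line2D objects, one per line drawn
line_fmt = lines_fmt[0]
line_kw = lines_kw[0]

# Compare colors as RGBA tuples because 'r' and 'red' are different strings
print('Same color:     ', to_rgba(line_fmt.get_color()) == to_rgba(line_kw.get_color()))
print('Same marker:    ', line_fmt.get_marker() == line_kw.get_marker())
print('Same line style:', line_fmt.get_linestyle() == line_kw.get_linestyle())
```

Format strings are convenient for quick plots. Keyword arguments are easier to read when several properties are set at once.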

Example

Create a plot of $f(x) = x^2$ with $x$ between -2 and 2.

  • Set the linewidth to 3 points
  • Set the line transparency (alpha) to 0.6
  • Set axis labels and title
  • Add a grid to the plot

In [6]:
# Create an array of x values from -2 to 2
x = np.arange(-2,2,0.001)

# Create a variable y equal to x squared
y = x**2

# Use the plot function to plot the line
plt.plot(x,y,linewidth=3,alpha = 0.6)

# Add a title and axis labels
plt.title('$f(x) = x^2$')
plt.xlabel('x')
plt.ylabel('y')

# Add grid
plt.grid()


Example

Create plots of the functions $f(x) = \log x$ (natural log) and $g(x) = 1/x$ between 0.01 and 5

  • Set the limits for the $x$-axis to (0,5)
  • Set the limits for the $y$-axis to (-2,5)
  • Make the line for $\log(x)$ solid blue
  • Make the line for $1/x$ dashed magenta
  • Set the linewidth of each line to 3 points
  • Set the line transparency (alpha) for each line to 0.6
  • Set axis labels and title
  • Add a legend
  • Add a grid to the plot

In [7]:
# Create an array of x values from 0.01 to 5
x = np.arange(0.01,5,0.01)

# Create y variables
y1 = np.log(x)
y2 = 1/x

# Use the plot function to plot the lines
plt.plot(x,y1,'b-',linewidth=3,alpha = 0.6,label='$log(x)$')
plt.plot(x,y2,'m--',linewidth=3,alpha = 0.6,label='$1/x$')

# Add a title and axis labels
plt.title('Two functions')
plt.xlabel('x')
plt.ylabel('y')

# Set axis limits
plt.xlim([0,5])
plt.ylim([-2,5])

# legend
plt.legend(loc='lower right',ncol=2)

# Add grid
plt.grid()


Example

Consider the linear regression model: \begin{align} y_i = \beta_0 + \beta_1 x_i + \epsilon_i \end{align} where $x_i$ is the independent variable, $\epsilon_i$ is a random regression error term, $y_i$ is the dependent variable and $\beta_0$ and $\beta_1$ are constants.

Let's simulate the model

  • Set values for $\beta_0$ and $\beta_1$
  • Create an array of $x_i$ values from -5 to 5
  • Create an array of $\epsilon_i$ values from the standard normal distribution equal in length to the array of $x_i$s
  • Create an array of $y_i$s
  • Plot y against x with either a circle ('o'), triangle ('^'), or square ('s') marker and set transparency (alpha) to 0.5
  • Add axis labels, a title, and a grid to the plot

In [8]:
# Set betas
beta0 = 1
beta1 = -0.5

# Create x values
x = np.arange(-5,5,0.01)

# create epsilon values from the standard normal distribution
epsilon = np.random.normal(size=len(x))

# create y
y = beta0 + beta1*x+epsilon

# plot
plt.plot(x,y,'o',alpha = 0.5)

# Add a title and axis labels
plt.title('Data')
plt.xlabel('x')
plt.ylabel('y')

# Set axis limits
plt.xlim([-5,5])

# Add grid
plt.grid()
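
One way to check the simulation is to draw the error-free line $\beta_0 + \beta_1 x$ on top of the simulated points, which should scatter evenly around it. The sketch below is an optional extension, not part of the original exercise. The seed value, the legend labels, and the title are assumptions made for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Seed the generator so the simulated errors are reproducible
# (the seed value 129 is an arbitrary choice for this sketch)
np.random.seed(129)

# Same parameter values and x grid as the cell above
beta0 = 1
beta1 = -0.5
x = np.arange(-5,5,0.01)
epsilon = np.random.normal(size=len(x))
y = beta0 + beta1*x + epsilon

# Simulated data as circles, then the error-free line on top
plt.plot(x,y,'o',alpha = 0.5,label='simulated data')
plt.plot(x,beta0 + beta1*x,'r-',lw=3,label=r'$\beta_0 + \beta_1 x$')

# Add a title, axis labels, legend, and grid
plt.title('Data and true regression line')
plt.xlabel('x')
plt.ylabel('y')
plt.xlim([-5,5])
plt.legend()
plt.grid()
```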


Example

Create plots of the functions $f(x) = x$, $g(x) = x^2$, and $h(x) = x^3$ for $x$ between -2 and 2

  • Use the optional string format argument to format the lines:
    • $x$: solid blue line
    • $x^2$: dashed green line
    • $x^3$: dash-dot magenta line
  • Set the linewidth of each line to 3 points
  • Set transparency (alpha) for each line to 0.6
  • Add a legend to lower right with 3 columns
  • Set axis labels and title
  • Add a grid to the plot

In [9]:
# Create an array of x values from -2 to 2
x = np.arange(-2,2,0.001)

# Create y variables
y1 = x
y2 = x**2
y3 = x**3

# Use the plot function to plot the lines
plt.plot(x,y1,'b-',lw=3,alpha = 0.6,label='$x$')
plt.plot(x,y2,'g--',lw=3,alpha = 0.6,label='$x^2$')
plt.plot(x,y3,'m-.',lw=3,alpha = 0.6,label='$x^3$')

# Add a title and axis labels
plt.title('Three functions')
plt.xlabel('x')
plt.ylabel('y')


# Add grid
plt.grid()

# legend
plt.legend(loc='lower right',ncol=3)


Out[9]:
<matplotlib.legend.Legend at 0x10e2dcc50>

Figures, axes, and subplots

Often we want to create plots with multiple axes or we want to modify the size and shape of the plot areas. To be able to do these things, we need to explicitly create a figure and then create the axes within the figure. The best way to see how this works is by example.

Example: A single plot with double width

The default dimensions of a matplotlib figure are 6 inches by 4 inches. As we saw above, this leaves some whitespace on the right side of the figure. Suppose we want to remove that by making the plot area twice as wide.

Plot the sine function on -6 to 6 using a figure with dimensions 12 inches by 4 inches.


In [10]:
# Create data
x = np.arange(-6,6,0.001)
y = np.sin(x)

# Create a new figure
fig = plt.figure(figsize=(12,4))

# Create axis
ax1 = fig.add_subplot(1,1,1)

# Plot
ax1.plot(x,y,lw=3,alpha = 0.6)

# Add grid
ax1.grid()


In the previous example the figure() function creates a new figure and add_subplot() puts a new axis on the figure. The command fig.add_subplot(1,1,1) means divide the figure fig into a 1 by 1 grid and assign the first component of that grid to the variable ax1.
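
The three arguments of add_subplot() are easier to follow on a larger grid. The sketch below, with axis names chosen for illustration, places three axes on a 2 by 2 grid and prints where each one sits in the figure, which shows that positions are numbered row by row starting from the top left.

```python
import matplotlib.pyplot as plt

fig = plt.figure()

# Positions in a 2 by 2 grid are numbered row by row, starting at 1
ax1 = fig.add_subplot(2,2,1)   # top left
ax2 = fig.add_subplot(2,2,2)   # top right
ax3 = fig.add_subplot(2,2,3)   # bottom left

# get_position() gives each axis's bounding box in figure coordinates,
# where (0,0) is the bottom-left corner of the figure
for name, ax in [('ax1',ax1),('ax2',ax2),('ax3',ax3)]:
    box = ax.get_position()
    print(name,'left edge:',round(box.x0,2),'bottom edge:',round(box.y0,2))
```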

Example: Two plots side-by-side

Create a new figure with two axes side-by-side and plot the sine function on -6 to 6 on the left axis and the cosine function on -6 to 6 on the right axis.


In [11]:
# Create data
x = np.arange(-6,6,0.001)
y1 = np.sin(x)
y2 = np.cos(x)

# Create a new figure
fig = plt.figure(figsize=(12,4))

# Create axis 1 and plot with title
ax1 = fig.add_subplot(1,2,1)
ax1.plot(x,y1,lw=3,alpha = 0.6)
ax1.grid()
ax1.set_xlabel('x')
ax1.set_ylabel('y')
ax1.set_title('sin')

# Create axis 2 and plot with title
ax2 = fig.add_subplot(1,2,2)
ax2.plot(x,y2,lw=3,alpha = 0.6)
ax2.grid()
ax2.set_xlabel('x')
ax2.set_ylabel('y')
ax2.set_title('cos')


Out[11]:
<matplotlib.text.Text at 0x10e3f20f0>

Example: Block of four plots

Create a new figure with four axes in a two-by-two grid. Plot the following functions on the interval -2 to 2:

  • $y = x$
  • $y = x^2$
  • $y = x^3$
  • $y = x^4$

Leave the figure size at the default (6in. by 4in.) but run the command plt.tight_layout() to adjust the figure's margins after creating your figure, axes, and plots.


In [15]:
# Create data
x = np.arange(-2,2,0.001)
y1 = x
y2 = x**2
y3 = x**3
y4 = x**4

# Create a new figure
fig = plt.figure()

# Create axis 1 and plot with title
ax1 = fig.add_subplot(2,2,1)
ax1.plot(x,y1,lw=3,alpha = 0.6)
ax1.grid()
ax1.set_xlabel('x')
ax1.set_ylabel('y')
ax1.set_title('$x$')

# Create axis 2 and plot with title
ax2 = fig.add_subplot(2,2,2)
ax2.plot(x,y2,lw=3,alpha = 0.6)
ax2.grid()
ax2.set_xlabel('x')
ax2.set_ylabel('y')
ax2.set_title('$x^2$')

# Create axis 3 and plot with title
ax3 = fig.add_subplot(2,2,3)
ax3.plot(x,y3,lw=3,alpha = 0.6)
ax3.grid()
ax3.set_xlabel('x')
ax3.set_ylabel('y')
ax3.set_title('$x^3$')

# Create axis 4 and plot with title
ax4 = fig.add_subplot(2,2,4)
ax4.plot(x,y4,lw=3,alpha = 0.6)
ax4.grid()
ax4.set_xlabel('x')
ax4.set_ylabel('y')
ax4.set_title('$x^4$')

# Adjust margins
plt.tight_layout()
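
The four blocks above differ only in the power of x, so the same figure can be built with a loop. This is an alternative sketch, not the method used in the cell above: it relies on plt.subplots(), which creates the figure and all of its axes in one call, and the loop variable n is chosen for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(-2,2,0.001)

# plt.subplots creates the figure and a 2 by 2 array of axes in one call
fig, axes = plt.subplots(2,2)

# axes.flat walks the grid row by row; n is the power for each panel
for n, ax in enumerate(axes.flat, start=1):
    ax.plot(x,x**n,lw=3,alpha = 0.6)
    ax.grid()
    ax.set_xlabel('x')
    ax.set_ylabel('y')
    ax.set_title('$x$' if n == 1 else '$x^'+str(n)+'$')

# Adjust margins
fig.tight_layout()
```

The explicit version is easier to customize panel by panel. The loop is shorter when every panel follows the same pattern.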


Exporting figures to image files

Use the plt.savefig() function to save figures to images.


In [13]:
# Create data
x = np.arange(-6,6,0.001)
y = np.sin(x)

# Create a new figure, axis, and plot
fig = plt.figure()
ax1 = fig.add_subplot(1,1,1)
ax1.plot(x,y,lw=3,alpha = 0.6)
ax1.grid()

# Save
plt.savefig('fig_econ129_class04_sine.png',dpi=120)


In the previous example, the image is saved as a PNG file with 120 dots per inch. This resolution is high enough to look good even when projected on a large screen. The image format is inferred from the extension on the filename.
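
To illustrate how the extension determines the format, the sketch below saves the same figure as a PNG and as a PDF. The file names and the use of a temporary directory are assumptions made so the example leaves nothing behind in the working directory.

```python
import os
import tempfile
import numpy as np
import matplotlib.pyplot as plt

# Create data, figure, axis, and plot
x = np.arange(-6,6,0.001)
fig = plt.figure()
ax1 = fig.add_subplot(1,1,1)
ax1.plot(x,np.sin(x),lw=3,alpha = 0.6)
ax1.grid()

# Hypothetical output location: a temporary directory
out_dir = tempfile.mkdtemp()
png_path = os.path.join(out_dir,'sine.png')
pdf_path = os.path.join(out_dir,'sine.pdf')

# Same figure, two formats; only the extension differs
fig.savefig(png_path,dpi=120)
fig.savefig(pdf_path)

# Report what was written
for path in [png_path,pdf_path]:
    print(os.path.basename(path),os.path.getsize(path),'bytes')
```

The dpi argument matters mainly for raster formats such as PNG. A PDF stores the lines as vector graphics, so it stays sharp at any zoom level.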