A good sketch is better than a long speech.


-Napoleon Bonaparte


What is Matplotlib?

A Python-based plotting library:

  • With support for a variety of different types of graphs
  • Using a MATLAB-like API
  • With support for integration with the IPython Notebook
  • That is used by Pandas for easy graphing operations

What kinds of 2- and 3-dimensional graphs are covered?

  • Line
  • Area
  • Bar (vertical, horizontal, stacked)
  • Pie
  • Scatter
  • Boxplot
  • Polarplot
  • Hexbins
  • Heatmaps
  • Any arbitrary combination of the above
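
To make "any arbitrary combination" concrete, here is a minimal sketch that draws a bar chart and a line on the same axes. The numbers and labels are made up purely for illustration.

```python
import matplotlib.pyplot as plt

# Made-up illustration data (not from any real dataset).
positions = [1, 2, 3, 4]
sales = [10, 14, 9, 17]
target = [12, 12, 12, 12]

# One figure, one axes; both plot types are drawn onto the same axes.
fig, ax = plt.subplots()
bars = ax.bar(positions, sales, label='Sales')
ax.plot(positions, target, color='red', marker='o', label='Target')
ax.legend()
```

Each plotting call adds its own artists to the axes, so mixing types is mostly a matter of calling several plotting methods on the same object.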

How will we use Matplotlib?

If you're a beginner, you'll probably use matplotlib only through pandas, where plotting can be as simple as a single command.


In [ ]:
# Import matplotlib
import matplotlib

# Import pandas
import pandas as pd

# Tell matplotlib to plot in this window instead of a separate window.
%matplotlib inline

# Load data into dataframe (we will get to this later)
df = pd.read_csv('data/simple.csv')

# Plot data as line plot using simple function.
df.plot()

In [ ]:
# Tip: the 538 blog-like styling looks nicer.
matplotlib.style.use('fivethirtyeight')

# Plot as histogram using slightly more complicated function.
df.plot.hist(bins=30)

In [ ]:
# Because it's matplotlib-based, you can feed in matplotlib options
df.plot(linestyle='--', marker='o', color='red', linewidth=.50)

In [ ]:
# If it's easier, df.plot() gives a matplotlib axes object for customization
axes = df.plot()
axes.annotate('Manual Annotation', (10, 40))

# We can also do this programmatically:
for number in [10, 20, 30, 40, 50]:
    axes.annotate('Auto-Annotation', (number, number))
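
The cells above assume that data/simple.csv is available. If it is not, the same pandas workflow can be sketched with a small DataFrame built in memory. The column names and values below are invented for illustration and are not the contents of the real file.

```python
import pandas as pd

# Made-up data standing in for data/simple.csv (the real file is not shown here).
df = pd.DataFrame({
    'a': [1, 4, 9, 16, 25],
    'b': [25, 16, 9, 4, 1],
})

# df.plot() draws one line per numeric column and returns a matplotlib Axes.
axes = df.plot(linestyle='--', marker='o')

# The returned Axes accepts the usual matplotlib calls.
note = axes.annotate('Crossover', (2, 9))
axes.set_title('Synthetic example')
```

Because df.plot() hands back an ordinary Axes object, anything you later learn about matplotlib axes applies to pandas plots as well.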

As you get more experienced you may want to use matplotlib directly:

Where matplotlib gets really fancy is when you write your own graphs from scratch. You can also use a higher-level library like seaborn.

While you're getting started, you'll want to limit yourself to:

  • matplotlib.pyplot (often imported as plt) as your toolbox
  • Your figure, which you can think of as your canvas together with everything attached to it, such as axes.
  • Your axes, which you can think of as the X and Y axes that tie your data to your plot.

Note: pyplot treats exactly one figure as the current figure at any one time. Things get considerably more complicated when working with multiple figures; plt.gca() (get current axes) and plt.gcf() (get current figure) help you find out which ones are active.
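
A minimal sketch of how the "current" figure and axes behave may help here. The variable names are arbitrary; the calls are standard pyplot functions.

```python
import matplotlib.pyplot as plt

# Create one figure (the canvas) holding one axes (the plotting area).
fig, ax = plt.subplots()

# The most recently created figure and axes are the "current" ones.
current_fig = plt.gcf()
current_ax = plt.gca()
print(current_fig is fig)  # True
print(current_ax is ax)    # True

# Creating a second figure makes it the current one instead.
fig2, ax2 = plt.subplots()
print(plt.gcf() is fig2)   # True

# plt.figure(num) switches the current figure back to an existing one.
plt.figure(fig.number)
print(plt.gcf() is fig)    # True
```

Keeping your own references (fig, ax) and calling methods on them, as the cells in this section do, avoids having to track which figure is current.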


In [ ]:
# Import pyplot directly because we're not plotting through pandas
import matplotlib.pyplot as plt

# Clear any existing figures to be safe.
plt.clf()

# Create figure and axes
fig, axes = plt.subplots()

# plot scatter (s is size)
a = axes.scatter([1, 2, 3, 4], [1, 4, 9, 16], s=50, label='My Scatter')

# Add an arbitrary line
axes.plot([1, 2, 3, 4], [1, 2, 3, 4], label='My Line')

In [ ]:
# Get the items we've already plotted
lines = axes.get_lines()

# We can get children of axes to reference elements
children = axes.get_children()

# In this case, scatter is child 0
scatter_points = children[0] # this is how we index objects.
scatter_points.set_color('red')

# In this case, the line is 1
line = children[1]
line.set_linestyle('-.')

# Make background white
axes.patch.set_facecolor('white')
fig.patch.set_facecolor('white')

# Print children for reference
for child in children:
    print(child)

fig

In [ ]:
# Zoom out
axes.set_xbound(0, 10)

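# Set labels. Fix the tick positions first so each label has a tick to attach to
# (the positions 2, 5, 8 are arbitrary choices within the 0-10 range).
axes.set_xticks([2, 5, 8])
axes.set_xticklabels(['Small', 'Medium', 'Large'])
axes.set_xlabel('Size')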

# Set title
axes.set_title('Graphing Stuff')

# Set legend
axes.legend()

# Save figure to PNG
fig.savefig('data/output.png')

fig

In [ ]:
# We already did this, but just for completeness' sake
import matplotlib
%matplotlib inline

# Import the actual plotting tool (pyplot)
from matplotlib import pyplot as plt

# Import numpy
import numpy as np

plt.gcf()


with plt.xkcd():

    fig = plt.figure()
    ax = fig.add_subplot(1, 1, 1)
    ax.bar([-0.125, 1.0-0.125], [25, 100], 0.25)
    ax.spines['right'].set_color('none')
    ax.spines['top'].set_color('none')
    ax.xaxis.set_ticks_position('bottom')
    ax.set_xticks([0, 1])
    ax.set_xlim([-0.5, 1.5])
    ax.set_ylim([0, 110])
    ax.set_xticklabels(['Intuitively\nExplains Complex\nRelationships',
                        'People Expect\nShiny, Science-y\nData Thingies'])
    plt.yticks([])

    plt.title("Reasons to Make a Graph @ U.S. Bank")

    font = {'size' : 18}

    matplotlib.rc('font', **font)

    im = plt.imread("static/small.jpg")
    implot = plt.imshow(im, aspect='auto', extent=[.1,  1, 55, 75], alpha=.25)

    fig.patch.set_facecolor('white')
    ax.patch.set_facecolor('white')

    plt.show()

Note: if you're looking for something more interactive/Tableau-like, see bokeh.

Note: Matplotlib plots points and not formulae.

If you have a formula, create a range of points via np.arange() or np.linspace(), and then graph your points vs. the output of the formula.
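
As a minimal sketch of that approach, the cell below samples a formula at evenly spaced points and plots the result. The quadratic is a hypothetical example chosen only for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical formula used only for illustration: y = x^2 - 3x + 2
def formula(x):
    return x**2 - 3*x + 2

# 200 evenly spaced x values from -2 to 5 (both endpoints included).
x = np.linspace(-2, 5, 200)

# NumPy applies the formula to every point at once.
y = formula(x)

# Plot the sampled points; with enough of them the line looks smooth.
fig, ax = plt.subplots()
ax.plot(x, y, label='y = x^2 - 3x + 2')
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.legend()
```

np.linspace() takes the number of points, while np.arange() takes a step size; more points give a smoother curve at the cost of more data to draw.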


Additional Learning Resources


Next Up: NumPy