Introduction to Matplotlib

Visualisations are a powerful way for humans to draw inferences from data. They let us condense huge amounts of information into easily digestible graphs.
Python has a wonderful tool called Matplotlib, whose plotting interface was inspired by MATLAB's. Let's begin with a few basic plots.

We will also incorporate more and more data visualisations in the next two sections, so they are not restricted to toy problems.


In [1]:
"""
We begin by using a built-in IPython magic function to display plots
within the notebook window.
"""
%matplotlib inline
import matplotlib.pyplot as plt

In [2]:
import matplotlib
print(matplotlib.__version__)


2.0.0

import matplotlib.pyplot as plt is the Python convention.
If you want, you can write import matplotlib.pyplot as chuck_norris instead, as the Line Plots section below does.
'as plt' is the accepted convention though, and it helps you write code faster.

Reference Section

Colour/Color Codes

Colour Code    Colour
r              Red
b              Blue
g              Green
c              Cyan
m              Magenta
y              Yellow
k              Black
w              White

Linestyle Codes

Linestyle Code    Displayed Line Style
-                 Solid Line
--                Dashed Line
:                 Dotted Line
-.                Dash-Dotted Line
None              No Connecting Lines

Marker Codes

Marker Code    Marker Displayed
+              Plus Sign
.              Dot
o              Circle
^              Triangle
p              Pentagon
s              Square
x              X Character
D              Diamond
h              Hexagon
*              Asterisk
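
The colour, linestyle and marker codes listed above can be combined into a single format string passed to plt.plot after the data. The following is a minimal sketch with made-up data values (they are not taken from the examples in this notebook); it also shows the equivalent keyword-argument form.

```python
import matplotlib.pyplot as plt

# Illustrative data only
x = [1, 2, 3, 4]
y = [1, 4, 9, 16]

# Format string: 'D' = diamond marker, '-.' = dash-dotted line, 'm' = magenta
(line_a,) = plt.plot(x, y, 'D-.m')

# The same styling spelled out with keyword arguments
(line_b,) = plt.plot(x, [v + 5 for v in y], marker='D', linestyle='-.', color='m')

print(line_a.get_marker(), line_a.get_linestyle(), line_a.get_color())
```

The keyword form is longer but easier to read back later; the format string is handy for quick exploration.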

British v American Spellings

British spellings often raise errors like this one:

AttributeError: Unknown property colour

To be on the safe side, use color, unless you're using R packages written by Hadley Wickham.
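
To see the spelling issue in action, the sketch below passes colour= to plt.plot and catches the resulting exception, then repeats the call with color=. The exact wording of the error message differs between Matplotlib versions, so only the exception type is relied on here; the data values are illustrative.

```python
import matplotlib.pyplot as plt

error_name = None
try:
    # British spelling: not a recognised Line2D property
    plt.plot([1, 2, 3], [1, 4, 9], colour='r')
except AttributeError as err:
    error_name = type(err).__name__
    print(error_name, '-', err)

# American spelling: accepted
(line,) = plt.plot([1, 2, 3], [1, 4, 9], color='r')
print(line.get_color())
```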

Style Guide

Line Plots


In [3]:
%matplotlib inline
import matplotlib.pyplot as chuck_norris

In [4]:
y = [1,2,3,4,5,4,3,2,1]
x = [2,4,6,8,10,12,10,8,6]

chuck_norris.plot(x, y, marker='D', linestyle='-.', color='m')
chuck_norris.plot([1,2,3,4,5,4,3,2,1], marker='^', linestyle='-', color='r')
chuck_norris.ylabel('Numbers')
#chuck_norris.show()


Out[4]:
<matplotlib.text.Text at 0x10ac64550>

So, as you can see, the plt convention saves you from typing chuck_norris every single time. Back to business though. Let's re-import matplotlib.pyplot under its usual name.

Another Line Plot


In [5]:
%matplotlib inline
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
# We have two lists (or, in more mathematical terms, arrays): x and y

In [6]:
plt.plot(x, y)


Out[6]:
[<matplotlib.lines.Line2D at 0x10e132d30>]

Let's break down what's happening.


In [7]:
# Import libraries
import matplotlib.pyplot as plt
%matplotlib inline

In [8]:
# Prepare the data
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

In [9]:
# Plot the data
plt.plot(x,y, label='Sales')

# Add a legend
plt.legend()

# Add more information
plt.xlabel('Adwords Spending (ZIM $)')
plt.ylabel('Monthly Sales (Oranges)')
plt.title('Effect of Adwords Spending on Monthly Sales')


Out[9]:
<matplotlib.text.Text at 0x10dfecb00>

But this is too small. Let's specify the size of the plot. Note that you can either set it once at the very top, right after you import your libraries, or change it every time you plot a graph.


In [10]:
plt.rcParams["figure.figsize"] = (15,7)
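
Setting plt.rcParams changes the default size for every figure created afterwards. When only one figure needs a different size, passing figsize to plt.figure is an alternative. A minimal sketch, where the 8 x 4 inch size is an arbitrary choice for illustration:

```python
import matplotlib.pyplot as plt

# figsize is (width, height) in inches and applies to this figure only
fig = plt.figure(figsize=(8, 4))
plt.plot([1, 2, 3, 4, 5], [1, 4, 9, 16, 25], label='Sales')
plt.legend()

width, height = fig.get_size_inches()
print(width, height)
```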

In [11]:
# Plot the data
plt.plot(x, y, label='Sales')

# Add a legend
plt.legend()

# Add more information
plt.xlabel('Adwords Spending (ZIM $)')
plt.ylabel('Monthly Sales (Oranges)')
plt.title('Effect of Adwords Spending on Monthly Sales')


Out[11]:
<matplotlib.text.Text at 0x10e1e0e10>

More Parameters for Line Plots


In [12]:
%matplotlib inline
import matplotlib.pyplot as plt

y = [1,4,9,16,25,36,49,64,81,100]
x1 = [5,10,15,20,25,30,35,40,45,47]
x2 = [1,1,2,3,5,8,13,21,34,53]

In [13]:
plt.rcParams["figure.figsize"] = (15,7)
plt.plot(y,x1, marker='+', linestyle='--', color='b',label='Blue Shift')
plt.plot(y,x2, marker='o', linestyle='-', color='r', label='Red Shift')
plt.xlabel('Days to Election')
plt.ylabel('Popularity')
plt.title('Candidate Popularity')
plt.legend(loc='lower right')


Out[13]:
<matplotlib.legend.Legend at 0x10e920550>

Bar Plots


In [14]:
%matplotlib inline
import matplotlib.pyplot as plt
plt.rcParams["figure.figsize"] = (15,7)

# Declare Values
vals = [10, 5, 3, 5, 7,6]
xval = [1, 2, 3, 4, 5,6]
# Bar Plot
plt.bar(xval, vals)
plt.title('Sales per Executive')
plt.xlabel('ID Number')
plt.ylabel('Weekly Sales')


Out[14]:
<matplotlib.text.Text at 0x10e9578d0>

Histograms


In [15]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams["figure.figsize"] = (15,7)

# Draw one million samples from the standard normal distribution in a single vectorised call
Y = np.random.randn(1000000)

In [17]:
# Here 500 is the number of bins. Try playing around with 10, 100, 200 etc. and see how it affects the shape of the graph
plt.hist(Y, 500)
plt.title('Distribution of Random Numbers')


Out[17]:
<matplotlib.text.Text at 0x1122ce358>
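
To get a feel for what the bin count does without redrawing the plot each time, the counts can be computed directly with np.histogram, which plt.hist uses for its binning. This is a rough sketch; the smaller sample of 10,000 values is an assumption made only to keep it quick.

```python
import numpy as np

# Smaller sample than the cell above, chosen only to keep this sketch fast
samples = np.random.randn(10000)

summary = {}
for n_bins in (10, 100, 500):
    counts, edges = np.histogram(samples, bins=n_bins)
    # n_bins bars need n_bins + 1 edges; every bar has the same width
    summary[n_bins] = (counts, edges)
    bar_width = edges[1] - edges[0]
    print(n_bins, 'bins | bar width', round(bar_width, 3), '| tallest bar', counts.max())
```

Fewer bins give wide bars with large counts and a smoother outline; more bins give narrow bars with smaller, noisier counts.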

Scatterplots


In [18]:
radius = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
# We import the math library. 
# This can also be done as from math import pi
# Then instead of math.pi, we simply use pi
import math
import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams["figure.figsize"] = (15,7)

# How awesome is list comprehension!!
area = [round((r**2)*math.pi,2) for r in radius]
print(area)


[3.14, 12.57, 28.27, 50.27, 78.54, 113.1, 153.94, 201.06, 254.47, 314.16]

In [20]:
plt.xlabel('Radius')
plt.ylabel('Area')
plt.title('Radius of Circle v Area')
plt.scatter(radius, area, color='g', s=30)


Out[20]:
<matplotlib.collections.PathCollection at 0x113052e80>

Another Scatterplot Example


In [24]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
plt.rcParams["figure.figsize"] = (15,7)

x = np.random.randn(1, 500)
y = np.random.randn(1,500)
plt.scatter(x, y, color='b', s=50) # s = marker area, in points squared
plt.xlabel('X axis')
plt.ylabel('Y axis')
plt.title('Scatter Plot')


Out[24]:
<matplotlib.text.Text at 0x1138b58d0>
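
The s argument of plt.scatter is measured in points squared, so it behaves like an area rather than a width, and it can also be an array with one value per point. A small sketch, using made-up positions and marker widths:

```python
import numpy as np
import matplotlib.pyplot as plt

# Illustrative data: five points in a row
x = np.arange(1, 6)
y = np.ones(5)

# Desired marker widths in points (hypothetical values); square them to get s
widths = np.array([5, 10, 15, 20, 25])
areas = widths ** 2

sc = plt.scatter(x, y, s=areas, color='b', alpha=0.5)
print(sc.get_sizes())
```

Doubling a marker's width therefore means multiplying s by four.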

Grids


In [25]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
plt.rcParams["figure.figsize"] = (15,7)

fig = plt.figure()

# add_subplot(121) means 1 row, 2 columns, plot number 1
# Plot for Left Hand Side
image1 = fig.add_subplot(121)
N=500
x = np.random.randn(N)
y = np.random.randn(N)
colors = np.random.rand(N)
size =(20 * np.random.rand(N))**2 
plt.scatter(x, y, s=size, c=colors, alpha=0.4)

# Plot for Right Hand Side
image2 = fig.add_subplot(122)
N=1000
x1 = np.random.randn(N)
y1 = np.random.randn(N)
area= (5 * np.random.rand(N))**3 
# Cycle through the palette so every point gets a colour; recent Matplotlib
# versions reject a colour list whose length differs from the number of points
palette = ['magenta', 'blue', 'black', 'yellow']
colors = [palette[i % len(palette)] for i in range(N)]
plt.scatter(x1, y1, s=area, c=colors, alpha=0.6)
image2.grid(True)



In [26]:
%matplotlib inline
import matplotlib.pyplot as plt
plt.rcParams["figure.figsize"] = (15,7)

y = [1,4,9,16,25,36,49,64,81,100]
x1 = [5,10,15,20,25,30,35,40,45,47]
x2 = [1,1,2,3,5,8,13,21,34,53]

fig = plt.figure()
fig.suptitle("Candidate Popularity", fontsize="x-large")
# add_subplot(121) means 1 row, 2 columns, plot number 1
# Plot for Left Hand Side
image1 = fig.add_subplot(121)
plt.xlabel('Days to Election')
plt.plot(y,x1, marker='+', linestyle='--', color='b')

# Plot for Right Hand Side
image2 = fig.add_subplot(122)
plt.xlabel('Days to Election')
plt.plot(y,x2, marker='o', linestyle='-', color='r')
#image2.grid(True)


Out[26]:
[<matplotlib.lines.Line2D at 0x113e88978>]

In [27]:
## Alternate Method
%matplotlib inline
import matplotlib.pyplot as plt
plt.rcParams["figure.figsize"] = (15,7)

fig = plt.figure()
fig.suptitle("Candidate Popularity", fontsize="x-large")

ax1 = fig.add_subplot(121)
ax1.plot(y, x1, 'r-')
ax1.set_title("Candidate 1")

ax2 = fig.add_subplot(122)
ax2.plot(y, x2, 'k-')
ax2.set_title("Candidate 2")

plt.tight_layout()
fig = plt.gcf()
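
A third route, not used in the cells above, is plt.subplots, which creates the figure and all of its axes in one call. The sketch below reuses the candidate data from this section; the sharey=True setting and the axis titles are illustrative choices rather than part of the original example.

```python
import matplotlib.pyplot as plt

y = [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
x1 = [5, 10, 15, 20, 25, 30, 35, 40, 45, 47]
x2 = [1, 1, 2, 3, 5, 8, 13, 21, 34, 53]

# 1 row, 2 columns; sharey=True puts both candidates on the same vertical scale
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 7), sharey=True)
fig.suptitle("Candidate Popularity", fontsize="x-large")

ax1.plot(y, x1, marker='+', linestyle='--', color='b')
ax1.set_title("Candidate 1")
ax1.set_xlabel('Days to Election')

ax2.plot(y, x2, marker='o', linestyle='-', color='r')
ax2.set_title("Candidate 2")
ax2.set_xlabel('Days to Election')
```

Working on ax1 and ax2 directly avoids relying on which subplot pyplot currently considers active.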


Saving Plots


In [28]:
%matplotlib inline
import matplotlib.pyplot as plt
plt.rcParams["figure.figsize"] = (15,7)

y = [1,4,9,16,25,36,49,64,81,100]
x1 = [5,10,15,20,25,30,35,40,45,47]
x2 = [1,1,2,3,5,8,13,21,34,53]

fig = plt.figure()
fig.suptitle("Candidate Popularity", fontsize="x-large")
# add_subplot(121) means 1 row, 2 columns, plot number 1
# Plot for Left Hand Side
image1 = fig.add_subplot(121)
plt.xlabel('Days to Election')
plt.plot(y,x1, marker='+', linestyle='--', color='b')

# Plot for Right Hand Side
image2 = fig.add_subplot(122)
plt.xlabel('Days to Election')
plt.plot(y,x2, marker='o', linestyle='-', color='r')
#image2.grid(True)

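# Save Figure (savefig does not create missing folders, so make sure images/ exists)
import os
os.makedirs("images", exist_ok=True)
plt.savefig("images/pop.png")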

# Save Transparent Figure
plt.savefig("images/pop2.png", transparent=True)
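
savefig picks the output format from the file extension and accepts a few options that are useful in practice, such as dpi for resolution and bbox_inches="tight" to trim surrounding whitespace. The sketch below is one possible way to use them; the temporary directory and the file names are hypothetical stand-ins for the images/ folder used above.

```python
import os
import tempfile
import matplotlib.pyplot as plt

# Hypothetical output location, used so the sketch does not depend on images/
out_dir = tempfile.mkdtemp()
png_path = os.path.join(out_dir, "pop.png")
pdf_path = os.path.join(out_dir, "pop.pdf")

fig, ax = plt.subplots()
ax.plot([1, 4, 9, 16, 25], [5, 10, 15, 20, 25], marker='+', linestyle='--', color='b')
ax.set_xlabel('Days to Election')

# Higher dpi gives a sharper raster image; bbox_inches="tight" trims the margins
fig.savefig(png_path, dpi=150, bbox_inches="tight")

# The .pdf extension selects a vector format
fig.savefig(pdf_path)

print(os.path.getsize(png_path) > 0, os.path.getsize(pdf_path) > 0)
```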