Watch Me Code 1: Matplotlib

We will demonstrate Python's data visualization library, Matplotlib, by using it in two ways:

  • Standalone
  • With Pandas

In [5]:
# Jupyter Directive
%matplotlib inline 

# imports
import matplotlib
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

matplotlib.rcParams['figure.figsize'] = (20.0, 10.0) # larger figure size
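
Assigning to `matplotlib.rcParams` changes the figure size for every plot that follows. As a minimal sketch, assuming you only want a different size for a single figure, `matplotlib.rc_context` applies a setting temporarily and restores the previous value afterwards. The size `(8.0, 4.0)` is an arbitrary value chosen for illustration.

```python
import matplotlib
import matplotlib.pyplot as plt

# Remember the size currently in effect so we can compare afterwards
default_size = list(matplotlib.rcParams['figure.figsize'])

# Inside the with-block the override applies; outside it is undone
with matplotlib.rc_context({'figure.figsize': (8.0, 4.0)}):
    fig, ax = plt.subplots()
    size_inside = tuple(fig.get_size_inches())

size_after = list(matplotlib.rcParams['figure.figsize'])
```

Here `size_inside` reflects the temporary setting, while `size_after` matches whatever was configured before the block ran.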

Manual Plotting in Matplotlib


In [6]:
# Matplotlib plots any list-like sequence (Python lists, NumPy arrays, pandas Series)
x = [1,2,3,4,5]
xsquared = [1,4,9,16,25]
plt.plot(x,xsquared) # default is a blue line


Out[6]:
[<matplotlib.lines.Line2D at 0x223d95b1da0>]

In [ ]:
# this can be overridden. consult help(plt.plot) for details
plt.plot(x, xsquared, 'ro') # red dots
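
The format string `'ro'` is shorthand: `r` selects red and `o` selects circle markers with no connecting line. The sketch below draws the same data twice, once with the shorthand and once with the equivalent keyword arguments, to show that both produce the same line style. The variable names `short_line` and `long_line` are introduced here for illustration only.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
xsquared = [1, 4, 9, 16, 25]

# Format-string shorthand: red ('r') circle markers ('o'), no line
(short_line,) = plt.plot(x, xsquared, 'ro')

# The same style spelled out with keyword arguments
(long_line,) = plt.plot(x, xsquared, color='r', marker='o', linestyle='None')
```

The keyword form is longer but tends to be easier to read once several style options are combined.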

In [ ]:
# we can set the axis limits too, rather than auto scale. Calling plt.show() displays the plot without echoing the return value of the last call
plt.plot(x, xsquared, 'ro') # red dots
plt.axis([0,6,0,26]) # a list in the form [xmin, xmax, ymin, ymax]
plt.show()

In [ ]:
# Labels are simple
plt.plot(x, xsquared,'r--') # red dashes
plt.axis([0,6,0,26]) # a list in the form [xmin, xmax, ymin, ymax]
plt.xlabel("Value of X", fontsize=14)
plt.ylabel("Value of X Squared", fontsize=14)
plt.title("Plot of X versus X Squared", fontsize=20)
plt.grid(True)
plt.show()
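
The cells above use the `plt.*` functions, which act on an implicit "current" figure. As an alternative sketch, the same labelled plot can be built through Matplotlib's object-oriented interface, where `plt.subplots()` returns a figure and an axes object and every setting is a method call on that axes. This is one possible restructuring, not the approach the notebook itself takes.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
xsquared = [1, 4, 9, 16, 25]

fig, ax = plt.subplots()
ax.plot(x, xsquared, 'r--')        # red dashes
ax.set_xlim(0, 6)                  # same limits as plt.axis([0,6,0,26])
ax.set_ylim(0, 26)
ax.set_xlabel("Value of X", fontsize=14)
ax.set_ylabel("Value of X Squared", fontsize=14)
ax.set_title("Plot of X versus X Squared", fontsize=20)
ax.grid(True)
```

Holding on to `ax` makes it explicit which plot each call modifies, which helps once a figure contains more than one set of axes.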

Plotting chart types


In [ ]:
plt.bar(x,xsquared)

In [ ]:
plt.pie(x)

In [ ]:
plt.scatter(x, xsquared)
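
To compare the three chart types at a glance, they can be drawn side by side in one figure. The sketch below assumes a 1-by-3 grid from `plt.subplots`; the figure size, titles, and pie labels are illustrative choices rather than anything from the cells above.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
xsquared = [1, 4, 9, 16, 25]

# One row, three columns of axes
fig, (ax_bar, ax_pie, ax_scatter) = plt.subplots(1, 3, figsize=(15, 4))

bars = ax_bar.bar(x, xsquared)               # one bar per x value
ax_bar.set_title("bar")

ax_pie.pie(x, labels=[str(v) for v in x])    # wedge sizes proportional to x
ax_pie.set_title("pie")

points = ax_scatter.scatter(x, xsquared)     # one marker per (x, y) pair
ax_scatter.set_title("scatter")
```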

Plotting with Pandas


In [ ]:
scores = pd.read_csv("https://raw.githubusercontent.com/mafudge/datasets/master/exam-scores/exam-scores.csv")
scores.sample(10)

In [ ]:
# Plotting with Pandas is a bit more expressive
scores.plot.scatter(x ='Completion Time', y ='Student Score' )

In [ ]:
# Labels are too small; we can fall back to Matplotlib!
p = scores.plot.scatter(x ='Completion Time', y ='Student Score', fontsize=20)
p.set_xlabel('Completion Time', fontsize=20)
p.set_ylabel('Student Score', fontsize=20)
p
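
The pandas plotting methods return an ordinary Matplotlib axes object, which is why `set_xlabel` and `set_ylabel` work on `p`. The sketch below shows this with a small inline data frame so that it runs without network access. The column names match the exam scores data set, but the five rows are made-up values for illustration only.

```python
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.axes import Axes

# Hypothetical sample rows; only the column names come from the real data set
sample = pd.DataFrame({
    'Completion Time': [20, 35, 45, 50, 60],
    'Student Score': [95, 88, 76, 81, 70],
})

# fontsize here controls the tick labels
ax = sample.plot.scatter(x='Completion Time', y='Student Score', fontsize=14)

# The result is a Matplotlib Axes, so the usual methods apply
ax.set_xlabel('Completion Time', fontsize=14)
ax.set_ylabel('Student Score', fontsize=14)
ax.set_title('Score versus completion time (sample rows)', fontsize=16)
```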

In [ ]:
# Take the value counts of letter grade and create a data frame
letter_grades = pd.DataFrame( { 'Letter' : scores['Letter Grade'].value_counts() } ).sort_index()
# the index is already sorted by sort_index(); the sort_columns argument was removed in pandas 2.0
letter_grades.plot.bar()

In [ ]:
letter_grades.plot.pie( y = 'Letter', fontsize = 20)
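
Wrapping the counts in a data frame is optional, since the Series returned by `value_counts()` can be plotted directly. The sketch below assumes a short, made-up list of letter grades in place of the real `Letter Grade` column and draws the bar and pie charts next to each other. The `autopct` format adds percentage labels to the pie wedges.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical grades standing in for scores['Letter Grade']
grades = pd.Series(['A', 'B', 'B', 'C', 'A', 'B'])

# Count each letter, then order the result alphabetically by letter
counts = grades.value_counts().sort_index()

fig, (ax_bar, ax_pie) = plt.subplots(1, 2, figsize=(12, 5))
counts.plot.bar(ax=ax_bar)                      # one bar per letter
counts.plot.pie(ax=ax_pie, autopct='%1.0f%%')   # one wedge per letter
```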