Imports



In [ ]:

    
# pandas will be useful for quick data parsing
import pandas as pd
import numpy as np

# Small trick to get a larger display
from IPython.display import display, HTML
display(HTML("<style>.container { width:90% !important; }</style>"))

Pyplot is the Matplotlib plotting interface, and the inline magic displays the graphs directly in the notebook



In [ ]:

    
import matplotlib.pyplot as pl
%matplotlib inline

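Or you can use pylab, which pulls the matplotlib and numpy names into a single namespace and makes the calls a little shorter (note that the Matplotlib documentation now discourages pylab in favour of pyplot)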



In [ ]:

    
import pylab as pl
%pylab inline

We can define a default size for all plots that will be generated by matplotlib



In [ ]:

    
pl.rcParams['figure.figsize'] = (20, 7)

Introduction to plotting with `matplotlib`

2D plotting library that produces high-quality figures

Full integration in Jupyter

Can generate plots, histograms, power spectra, bar charts, error charts, scatter plots, ... with just a few lines of code

For the power user, you have full control of line styles, font properties, axes properties, ...

See many example plots in the Matplotlib gallery

The documentation of pyplot is extensive but a little hard to understand

Before we start

Many named colors are available (keyword: color)

As well as color palettes (keyword: colormap)

There are 4 named line styles: solid, dashed, dash-dot and dotted (keyword: linestyle)

And many different marker types for plot points (keyword: marker)

pyplot also provides style sheets that yield high-quality rendering with little effort

In Jupyter, we can change the default parameters with pl.rcParams



In [ ]:

    
pl.rcParams['figure.figsize'] = 20, 7
pl.rcParams['font.family'] = 'sans-serif'
pl.rcParams['font.sans-serif'] = ['DejaVu Sans']

A style sheet can also be set as the default. The available ones are listed in pl.style.available



In [ ]:

    
pl.style.available

Let's use ggplot style (R style) for this notebook



In [ ]:

    
pl.style.use('ggplot')
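
If a style should only apply to a single figure instead of the whole notebook, pl.style.context can be used as a context manager. This is a minimal sketch under that assumption, not part of the original notebook; the rcParams key that is inspected was chosen only to show that the setting is restored afterwards.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line inside a notebook
import matplotlib.pyplot as pl
import numpy as np

# Remember one setting before the style is applied
face_before = pl.rcParams["axes.facecolor"]

# The ggplot style is active only inside the with block
with pl.style.context("ggplot"):
    fig = pl.figure()
    pl.plot(np.arange(10), np.random.rand(10))
    pl.close(fig)

# Outside the block the previous settings are back
face_after = pl.rcParams["axes.facecolor"]
```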

Line plot

Plot lines and/or markers to the Axes

Requires 2 lists of coordinates for the x and y axes (or only 1 list for the y axis, in which case x is generated automatically)



In [ ]:

    
# Create random datasets with the numpy random module
x = np.arange(50)
y = np.random.rand(50)

# Plot y using the default line style and color; x is inferred automatically
pl.plot(y)

# Plot x and y without a line, using purple diamond markers
pl.plot(x, y+1, marker='d', linewidth=0, color="purple")

# Plot x and y using a blue dashed line
pl.plot(x, y+2, color='dodgerblue', linestyle='--')

# Plot x and y using a green dash-dot line with triangle markers
pl.plot(x, y+3, color='green', linewidth=2, marker='>', linestyle="-.")

# Plot x and y using a thick green solid line with circle markers
pl.plot(x, y+4, color='green', linewidth=4, marker='o', linestyle="-")
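
The four line styles mentioned earlier can be compared side by side by looping over their short codes. This is an illustrative sketch, not part of the original notebook; the marker choices and the sine data are arbitrary assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line inside a notebook
import matplotlib.pyplot as pl
import numpy as np

x = np.arange(20)
line_styles = ["-", "--", "-.", ":"]  # solid, dashed, dash-dot, dotted
markers = ["o", "s", "d", "^"]        # circle, square, diamond, triangle

fig = pl.figure()
drawn = []
for offset, (ls, mk) in enumerate(zip(line_styles, markers)):
    # Shift each curve vertically so the styles do not overlap
    (line,) = pl.plot(x, np.sin(x / 3) + offset, linestyle=ls, marker=mk,
                      label="linestyle=%r" % ls)
    drawn.append(line)

pl.legend()
```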

Scatter plot

Make a scatter plot of x vs y, where x and y are sequence-like objects of the same length.

Requires 2 lists of coordinates for the x and the y axis



In [ ]:

    
pl.scatter(np.random.randn(200), np.random.randn(200), color="coral")
pl.scatter(np.random.randn(100)+2, np.random.randn(100)+3, color="lightgreen")
pl.scatter(np.random.randn(100)-2, np.random.randn(100)*4, color="dodgerblue")
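
Instead of one fixed color per call, scatter can also map a numeric value to a color palette through the c and cmap keywords. The following sketch is an addition for illustration only; the choice of the viridis colormap and of the distance from the origin as the colored value are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line inside a notebook
import matplotlib.pyplot as pl
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(200)
y = rng.standard_normal(200)

# Value used to color each point: its distance from the origin
distance = np.sqrt(x**2 + y**2)

fig = pl.figure()
points = pl.scatter(x, y, c=distance, cmap="viridis", s=40)

# The colorbar shows how the values map to colors
pl.colorbar(points, label="distance from origin")
```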

Bar plot

Make a bar plot with rectangles

Requires a list of x coordinates for the bars (the bars are centered on them by default in recent matplotlib versions), a list of heights, and the width of the bars

Now plot the data as a bar plot



In [ ]:

    
# Create random datasets with the numpy random module
x = np.arange(10)

# Bars that share the same x coordinates are drawn at the same position
# (recent matplotlib versions name the first argument x instead of left)
h1 = np.random.rand(10)
pl.bar(x=x, height=h1, width=0.2, color="dodgerblue")

# To create a stacked graph, the bottom of a series must match the height of the previous series
h2 = np.random.rand(10)
pl.bar(x=x, height=h2, bottom=h1, width=0.2, color="lightblue")

# Offset the x coordinates to add a new series and customize color and aspect
h3 = np.random.rand(10)
pl.bar(x=x+0.2, height=h3, width=0.2, color='salmon', linewidth=2, edgecolor="red")

# Add y error bars (the yerr values must not be negative)
h4 = np.random.rand(10)
pl.bar(x=x+0.4, height=h4, width=0.2, color='green', yerr=np.random.rand(10)/10, ecolor="black")
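
The stacking rule described above (each series starts where the previous one ends) generalizes to any number of series by keeping a running total of the heights. The following is a minimal sketch, not part of the original notebook; the variable names (series, bottom, colors) and the data are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line inside a notebook
import matplotlib.pyplot as pl
import numpy as np

rng = np.random.default_rng(0)
x = np.arange(5)

# Three hypothetical series of bar heights
series = [rng.random(5) for _ in range(3)]
colors = ["dodgerblue", "lightblue", "salmon"]

fig = pl.figure()
bottom = np.zeros(5)  # running total of the heights already drawn
for heights, color in zip(series, colors):
    # Each series starts at the top of everything drawn before it
    pl.bar(x, heights, bottom=bottom, width=0.5, color=color)
    bottom = bottom + heights
```

After the loop, bottom holds the total height of each stack, which can be handy for placing labels above the bars.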

Histogram

Compute and draw the histogram of x

Requires a list of values and a number of bins to split the data into

Possible types of histogram to draw (histtype):

bar : a traditional bar-type histogram. If multiple data are given the bars are arranged side by side.
barstacked : a bar-type histogram where multiple data are stacked on top of each other.
step : a lineplot that is by default unfilled.
stepfilled : a lineplot that is by default filled.

The return value is a tuple containing the following:

n = The values of the histogram bins (counts, or densities if normalisation is requested)
bins = The edges of the bins
patches = List of individual patches used to create the histogram



In [ ]:

    
# Generate 2 series of 1000 values following a normal distribution
# With histtype='bar', the 2 series are drawn side by side
x = np.random.randn(1000, 2)
n, bins, patches = pl.hist(x=x, bins=30, histtype='bar')
print(n)
print(bins)
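
To make the three return values more concrete, here is a small sketch (not from the original notebook) that checks their sizes for a single series. It also uses the density keyword, which is how recent matplotlib versions request normalisation; the variable names are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line inside a notebook
import matplotlib.pyplot as pl
import numpy as np

rng = np.random.default_rng(0)
values = rng.standard_normal(1000)

fig = pl.figure()

# Raw counts: one value per bin, and one more edge than there are bins
n, bins, patches = pl.hist(values, bins=30, histtype="bar")

# Normalised version: the bar areas (height * bin width) sum to 1
density, density_bins, _ = pl.hist(values, bins=30, density=True)
area = (density * np.diff(density_bins)).sum()
```

With 30 bins, n has 30 entries, bins has 31 edges, and the counts add up to the number of input values.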



In [ ]:

    
# Generate 2 series of 1000 values following a normal distribution
# Contrary to the first plot, this time the series are stacked

x = np.random.randn(1000, 2)
n, bins, patches = pl.hist(x=x, bins=30, histtype='barstacked')



In [ ]:

    
# Generate a list of 1000 values following a normal distribution
# The plot is cumulative and drawn in step style

x = np.random.randn(1000)
n, bins, patches = pl.hist(x=x, bins=30, histtype='step', cumulative=True)



In [ ]:

    
# Generate a list of 1000 values following a normal distribution
# The plot is rotated to horizontal orientation and drawn in stepfilled style

x = np.random.randn(1000)
n, bins, patches = pl.hist(x=x, bins=30, histtype='stepfilled', orientation="horizontal")

Customize the plotting area

The plotting area can be customized easily as shown below



In [ ]:

    
# Size of the plotting area
pl.figure(figsize=(15,10))

# Customize X and Y limits
pl.xlim(-1,10)
pl.ylim(-0.5,1.5)

# Add X label, y label and a title
pl.xlabel("this is my x label", fontsize=15)
pl.ylabel("this is my Y label", fontsize=15)
pl.title("this is my title", fontsize=20)

# Add a grid
pl.grid(True, color="grey", linewidth=0.5, linestyle="--")

# finally plot the graphs
pl.plot(np.arange(10), np.random.rand(10), color="coral", marker=">", label = "series1")
pl.plot(np.arange(10), np.random.rand(10), color="dodgerblue", marker="<", label = "series2")

# Add the legend outside of the plotting area
pl.legend(bbox_to_anchor=(1, 1), loc=2, frameon=False, fontsize=15)

The figure area can also be divided to plot several graphs side by side with the subplot command



In [ ]:

    
pl.figure()

# First plot in the left half
pl.subplot(121)
pl.plot(np.arange(10), np.random.rand(10), label="1")
pl.plot(np.arange(10), np.random.rand(10), label="2")
pl.title("Series1")
pl.legend()

# Second plot in the right half
pl.subplot(122)
pl.plot(np.arange(10), np.random.rand(10), label="3")
pl.plot(np.arange(10), np.random.rand(10), label="4")
pl.title("Series2")
pl.legend()



In [ ]:

    
pl.figure(figsize=(15,15))

# First plot in the top left corner
pl.subplot(221)
pl.plot(np.arange(10), np.random.rand(10))

# The top right corner is left empty (uncomment to fill it)
#pl.subplot(222)
#pl.plot(np.arange(10), np.random.rand(10))

# Next plot in the bottom left corner
pl.subplot(223)
pl.plot(np.arange(10), np.random.rand(10))

# Last plot in the bottom right corner
pl.subplot(224)
pl.plot(np.arange(10), np.random.rand(10))
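
As an alternative to repeated subplot calls, pyplot also offers pl.subplots, which creates the figure and the whole grid of axes in one call. The sketch below is an illustration rather than part of the original notebook; the titles and figure size are arbitrary assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line inside a notebook
import matplotlib.pyplot as pl
import numpy as np

# Create a 2 x 2 grid; axes is a 2D numpy array of Axes objects
fig, axes = pl.subplots(nrows=2, ncols=2, figsize=(8, 8))

x = np.arange(10)
titles = ["top left", "top right", "bottom left", "bottom right"]

# axes.flat iterates over the grid row by row
for ax, title in zip(axes.flat, titles):
    ax.plot(x, np.random.rand(10))
    ax.set_title(title)

# Reduce overlap between titles and tick labels
fig.tight_layout()
```

Each element of axes can then be customized on its own, for example axes[0, 1].set_xlim(0, 5).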

Python plotting beyond Matplotlib

Seaborn is a Python visualization library based on matplotlib. It provides a high-level interface for drawing attractive statistical graphics.

Plotly is a collaborative browser-based plotting and analytics platform. You can generate graphs and analyze data directly in the browser.

ggplot is a plotting system for Python based on R's ggplot2 and the Grammar of Graphics. It is built for making professional-looking plots quickly with minimal code.