Matplotlib

Introduction

Matplotlib is a Python 2D plotting library that produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms. It can be used in Python scripts, the Python and IPython shells, the Jupyter notebook, web application servers, and four graphical user interface toolkits.

Installation

pip3 install matplotlib

In [12]:
import matplotlib.pyplot as plt
import pandas as pd

You'll also need to use this line to see plots in the notebook:


In [2]:
%matplotlib inline

That line is only for Jupyter notebooks. If you are using another editor, call plt.show() at the end of your plotting commands to have the figure pop up in a separate window.
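
Outside a notebook, the same plotting commands go into an ordinary script that ends with plt.show(). The following is a minimal sketch, assuming a desktop session where a plot window can open; the data and axis labels are made up for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data: eleven points on the curve y = x^2.
x = np.linspace(0, 5, 11)
y = x ** 2

lines = plt.plot(x, y, 'r--')  # red dashed line
plt.xlabel('x')
plt.ylabel('y')

# Opens the figure in its own window; in a notebook with
# %matplotlib inline this call is not needed.
plt.show()
```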

Using global functions

  • plt.bar – creates a bar chart
  • plt.scatter – makes a scatter plot
  • plt.boxplot – makes a box and whisker plot
  • plt.hist – makes a histogram
  • plt.plot – creates a line plot
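
Of the functions listed above, plt.hist is the only one not used in the cells that follow. A minimal sketch with randomly generated data (the sample and the bin count are arbitrary choices for illustration):

```python
import matplotlib.pyplot as plt
import numpy as np

# Illustrative sample: 500 draws from a standard normal distribution.
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=1.0, size=500)

# plt.hist returns the bin counts, the bin edges and the drawn bar patches.
counts, edges, patches = plt.hist(sample, bins=20)
plt.xlabel('value')
plt.ylabel('count')
```

With 20 bins there are 21 bin edges, and the counts add up to the sample size.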

In [3]:
import numpy as np
x = np.linspace(0, 5, 11)
y = x ** 2

In [4]:
plt.plot(x, y, 'r--')  # red dashed line; use 'b--' for a blue one


Out[4]:
[<matplotlib.lines.Line2D at 0x7fc3de779e48>]

In [5]:
x = np.arange(5)          # bar positions 0..4
y = (20, 35, 30, 35, 27)  # bar heights
plt.bar(x, y)


Out[5]:
<Container object of 5 artists>

In [6]:
# x and y were overwritten by the bar chart, so rebuild the curve first.
x = np.linspace(0, 5, 11)
y = x ** 2
plt.scatter(x, y)


Out[6]:
<matplotlib.collections.PathCollection at 0x7fc3de6bb518>

In [7]:
plt.subplot(1, 2, 1)  # 1 row, 2 columns, select panel 1
plt.plot(x, y, 'r--')
plt.subplot(1, 2, 2)  # select panel 2
plt.plot(y, x, 'g*-');



In [8]:
# Three samples of 100 normal draws with standard deviations 1, 2 and 3.
data = [np.random.normal(0, std, 100) for std in range(1, 4)]
plt.boxplot(data, vert=True, patch_artist=True);



Matplotlib Object-Oriented Method

Now that we've seen the basics, let's look at Matplotlib's object-oriented API more formally. Here we instantiate figure and axes objects and then call methods on those objects, instead of relying on the global plt functions.
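
As a minimal sketch of that pattern (the data, labels and axes position are illustrative choices): create a Figure, add an Axes to it, and then plot and label through methods of the Axes object.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 11)
y = x ** 2

# Create an empty figure, then add one set of axes to it.
# The list is [left, bottom, width, height] as fractions of the figure size.
fig = plt.figure()
ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])

# Plot and label through methods of the axes object
# instead of the global plt functions.
ax.plot(x, y, 'b')
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_title('y = x squared')
```

Because every call names the axes it acts on, this style stays unambiguous when a figure holds several panels.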


In [13]:
X = pd.read_csv('Datasets/matplot/X.csv')
Y = pd.read_csv('Datasets/matplot/Y.csv')

In [15]:
import matplotlib.patches as mpatches

In [21]:
fig, axes = plt.subplots(1, 2, figsize=(20, 8))

# Color each sample by its class label stored in Y['0'].
axes[0].scatter(X['ENSG00000141448.7'], X['ENSG00000178401.13'], c=Y['0'], marker='o', cmap='jet', s=30)
axes[1].scatter(X['ENSG00000006611.14'], X['ENSG00000106078.16'], c=Y['0'], marker='o', cmap='jet', s=30)

# Legend entries are built by hand, one patch per class, and shared by both panels.
handles = [mpatches.Patch(color='red', label='bladder'),
           mpatches.Patch(color='green', label='colorectal'),
           mpatches.Patch(color='blue', label='pancreas')]
for ax in axes:
    ax.legend(handles=handles, fontsize=15)
    ax.set_xlabel('feature1')
    ax.set_ylabel('feature2')

# Panel titles.
axes[0].set_title("figure 1.2")
axes[1].set_title("figure 1.3")


Out[21]:
Text(0.5,1,'figure 1.3')
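
The legend patches above are colored by hand, so they match the points only if the jet colormap happens to assign those colors to the classes. One alternative is to ask the scatter object for its own legend handles with legend_elements() (available in Matplotlib 3.1 and later). The sketch below uses synthetic data in place of X.csv and Y.csv. The mapping of label values 0, 1, 2 to pancreas, colorectal, bladder is an assumption inferred from the hand-picked legend colors, not something the data files are known to state.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for the real data: 90 points in three classes.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 30)
feature1 = rng.normal(loc=labels, scale=0.5)
feature2 = rng.normal(loc=-labels, scale=0.5)

fig, ax = plt.subplots(figsize=(8, 6))
sc = ax.scatter(feature1, feature2, c=labels, cmap='jet', s=30)

# One handle per distinct value in c, colored exactly as the points are.
handles, _ = sc.legend_elements()

# Assumed class order; check it against the actual label encoding.
class_names = ['pancreas', 'colorectal', 'bladder']
ax.legend(handles, class_names, fontsize=15)
ax.set_xlabel('feature1')
ax.set_ylabel('feature2')
```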

Seaborn

Seaborn is a statistical plotting library built on top of Matplotlib.

In [23]:
import seaborn as sns

Data

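Seaborn provides example datasets. load_dataset fetches them by name from an online repository, so an internet connection is needed the first time a dataset is loaded.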


In [37]:
tips = sns.load_dataset('tips')

tips.head()


Out[37]:
total_bill tip sex smoker day time size
0 16.99 1.01 Female No Sun Dinner 2
1 10.34 1.66 Male No Sun Dinner 3
2 21.01 3.50 Male No Sun Dinner 3
3 23.68 3.31 Male No Sun Dinner 2
4 24.59 3.61 Female No Sun Dinner 4

distplot

The distplot shows the distribution of a univariate set of observations.


In [50]:
sns.distplot(tips['total_bill'],bins=30)


Out[50]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fc3c67c5cf8>

To remove the KDE layer and keep only the histogram, use:


In [55]:
sns.distplot(tips['total_bill'],kde=False,bins=30)


Out[55]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fc3c6634898>

jointplot

jointplot() combines two univariate distribution plots, drawn on the margins, with a bivariate plot of the same pair of variables. The kind parameter selects how the bivariate relationship is drawn:

  • “scatter”
  • “reg”
  • “resid”
  • “kde”
  • “hex”

In [40]:
sns.jointplot(x='total_bill',y='tip',data=tips,kind='scatter')


Out[40]:
<seaborn.axisgrid.JointGrid at 0x7fc3c703aa58>

In [45]:
sns.jointplot(x='total_bill',y='tip',data=tips,kind='hex')


Out[45]:
<seaborn.axisgrid.JointGrid at 0x7fc3c6a09860>

In [57]:
sns.jointplot(x='total_bill',y='tip',data=tips,kind='reg')


Out[57]:
<seaborn.axisgrid.JointGrid at 0x7fc3c6645eb8>

pairplot

pairplot plots pairwise relationships between the numerical columns of a DataFrame. It also accepts a hue argument that colors the points by a categorical column.


In [43]:
sns.pairplot(tips)


Out[43]:
<seaborn.axisgrid.PairGrid at 0x7fc3c70402b0>

In [44]:
sns.pairplot(tips,hue='sex',palette='coolwarm')


Out[44]:
<seaborn.axisgrid.PairGrid at 0x7fc3c72f7710>