Plotly and Cufflinks for plotting with Pandas

Cufflinks connects Plotly to pandas so that you can plot directly from a pandas DataFrame. Install it with pip:

pip install cufflinks

In [1]:
import numpy as np
import pandas as pd
import cufflinks as cf
from plotly.offline import download_plotlyjs, init_notebook_mode
from plotly.offline import plot, iplot

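# Set notebook mode: connected=True loads plotly.js from the CDN,
# and cf.go_offline() makes cufflinks render plots offline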
init_notebook_mode(connected=True)
cf.go_offline()



In [2]:
df = pd.DataFrame(np.random.randn(100, 4),
                  columns='A B C D'.split())
df.head()


Out[2]:
          A         B         C         D
0 -1.539048 -0.876097 -0.236866 -0.591233
1 -0.347391  0.584317  1.223430  0.269432
2  0.342396 -1.198989  0.692799  0.451392
3 -2.244973  0.979081 -0.896566 -0.114954
4 -1.358436  1.246352  1.182089 -1.072197

In [3]:
df2 = pd.DataFrame({'category':['A','B','C'], 'values':[33,56,67]})
df2


Out[3]:
  category  values
0        A      33
1        B      56
2        C      67

interactive plotting

line plots

With Plotly, you can toggle individual traces on and off by clicking their entries in the legend.


In [4]:
df.iplot()


bar plot


In [5]:
df2.iplot(kind='bar')


box plot


In [6]:
df.iplot(kind='box')
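
To read the box plot numerically, it can help to compute the quartiles yourself. The following is a minimal sketch, not part of the original notebook: it rebuilds a frame shaped like df with a seeded generator (the seed and the names rng, quartiles and iqr are assumptions for illustration) and prints the values that a box conventionally spans.

```python
import numpy as np
import pandas as pd

# Seeded stand-in for the notebook's df so the numbers are reproducible
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((100, 4)), columns=list("ABCD"))

# Conventionally the box spans Q1 to Q3, with a line at the median (Q2)
quartiles = df.quantile([0.25, 0.5, 0.75])

# Interquartile range per column, i.e. the height of each box
iqr = quartiles.loc[0.75] - quartiles.loc[0.25]

print(quartiles)
print(iqr)
```

Plotly may use a slightly different quantile interpolation than pandas, so the hover values can differ a little from these.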


surface plot


In [7]:
df3 = pd.DataFrame({'x': [1, 2, 3, 4, 5],
                    'y': [11, 22, 33, 44, 55],
                    'z': [5, 4, 3, 2, 1]})
df3


Out[7]:
   x   y  z
0  1  11  5
1  2  22  4
2  3  33  3
3  4  44  2
4  5  55  1

In [8]:
df3.iplot(kind='surface')
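
A surface is easier to interpret when the DataFrame is laid out as a grid of heights. The sketch below assumes that kind='surface' reads the whole frame as a matrix of z values, with the index and columns as the two axes. That is an assumption about cufflinks' behaviour and worth checking against your installed version. The function z = x**2 + y**2 and the name grid are made up for illustration.

```python
import numpy as np
import pandas as pd

# Axis values for a small 5 x 5 grid
x = np.linspace(-2, 2, 5)
y = np.linspace(-2, 2, 5)

# meshgrid with default 'xy' indexing: rows follow y, columns follow x
xx, yy = np.meshgrid(x, y)

# Each cell holds the height z at that (y, x) position
grid = pd.DataFrame(xx**2 + yy**2, index=y, columns=x)
print(grid)
```

Under that assumption, grid.iplot(kind='surface') in the notebook would draw a bowl shape with its lowest point at the centre.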


histograms


In [9]:
df.iplot(kind='hist',bins=50)
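
To see what a 50-bin histogram amounts to for a single column, you can bin the data with NumPy. This is a minimal sketch on a seeded stand-in for df (the seed and the names rng, counts and edges are assumptions). Plotly chooses its own bin edges, so the interactive plot will not necessarily match these counts exactly.

```python
import numpy as np
import pandas as pd

# Seeded stand-in for the notebook's df
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((100, 4)), columns=list("ABCD"))

# 50 equal-width bins between the min and max of column A
counts, edges = np.histogram(df["A"], bins=50)

# 50 bins have 51 edges; the counts add up to the number of rows
print(len(counts), len(edges), counts.sum())
```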


spread plots

A spread plot shows how far apart the values of two columns (variables) are.


In [12]:
df[['A','B']].iplot(kind='spread')
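
The spread itself can be sketched as the row-by-row difference between the two columns. The snippet below assumes that definition (first column minus second) and uses a seeded stand-in for df. The names rng and spread are illustrative and not part of the notebook.

```python
import numpy as np
import pandas as pd

# Seeded stand-in for the notebook's df
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((100, 4)), columns=list("ABCD"))

# Row-by-row difference between the two columns
spread = df["A"] - df["B"]

# Positive values mean A is above B on that row, negative the reverse
print(spread.head())
print("rows where A > B:", int((spread > 0).sum()))
```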


bubble scatter plots

Same as a scatter plot, but the size of each marker is set by another column.


In [16]:
df.iplot(kind='bubble',x='A', y='B', size='C')


scatter matrix

This is similar to seaborn's pairplot: every column is plotted against every other column.


In [17]:
df.scatter_matrix()
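
A correlation matrix is a compact numeric companion to the scatter matrix, with one number per pair of columns instead of one panel. This is a sketch on a seeded stand-in for df (the seed and the names rng and corr are assumptions), not something the notebook itself computes.

```python
import numpy as np
import pandas as pd

# Seeded stand-in for the notebook's df
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((100, 4)), columns=list("ABCD"))

# Pearson correlation for every pair of columns; the diagonal is 1
corr = df.corr()
print(corr.round(2))
```

For independent random columns like these, the off-diagonal values should be close to zero, which matches the shapeless clouds in the off-diagonal panels.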