Pandas' built-in data visualization

Importing packages


In [4]:
import numpy as np
import pandas as pd
import seaborn as sns
%matplotlib inline

Loading CSV file into dataframe


In [8]:
df1 = pd.read_csv('df1', index_col=0)

Displaying the dataframe's first five rows


In [9]:
df1.head()


Out[9]:
A B C D
2000-01-01 1.339091 -0.163643 -0.646443 1.041233
2000-01-02 -0.774984 0.137034 -0.882716 -2.253382
2000-01-03 -0.921037 -0.482943 -0.417100 0.478638
2000-01-04 -1.738808 -0.072973 0.056517 0.015085
2000-01-05 -0.905980 1.778576 0.381918 0.291436
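The index printed above looks like dates, but `index_col=0` on its own leaves the labels as plain strings. A minimal sketch of the difference that `parse_dates=True` makes follows. The layout of the real `df1` file is not shown in this notebook, so the in-memory CSV below is an assumption built from two rows of the output above.

```python
import io

import pandas as pd

# Hypothetical stand-in for the 'df1' file: two rows copied from the head() output,
# with an unnamed first column holding the dates (assumed layout).
csv_text = """,A,B,C,D
2000-01-01,1.339091,-0.163643,-0.646443,1.041233
2000-01-02,-0.774984,0.137034,-0.882716,-2.253382
"""

# Without parse_dates the index keeps the raw text labels.
as_strings = pd.read_csv(io.StringIO(csv_text), index_col=0)

# With parse_dates=True pandas tries to parse the index column as dates.
as_dates = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

print(type(as_strings.index).__name__)
print(type(as_dates.index).__name__)
```

A parsed index lets time series plots format the horizontal axis as dates instead of treating every label as a category.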

Loading the df2 file into a second dataframe


In [10]:
df2 = pd.read_csv('df2')

Displaying the dataframe's head


In [11]:
df2.head()


Out[11]:
a b c d
0 0.039762 0.218517 0.103423 0.957904
1 0.937288 0.041567 0.899125 0.977680
2 0.780504 0.008948 0.557808 0.797510
3 0.672717 0.247870 0.264071 0.444358
4 0.053829 0.520124 0.552264 0.190008

Calling the dataframe's built-in visualization methods to create a histogram


In [12]:
df1['A'].hist(bins=20)


Out[12]:
<matplotlib.axes._subplots.AxesSubplot at 0xb5a0358>

In [15]:
#df1['A'].plot(kind='hist', bins=30)
df1['A'].plot.hist()


Out[15]:
<matplotlib.axes._subplots.AxesSubplot at 0x9af9da0>
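The two spellings used above, `plot.hist(...)` and `plot(kind='hist', ...)`, should draw the same histogram. The sketch below checks that on synthetic data, since the `df1` file is not available here. The series `values` and the fixed seed are assumptions for illustration only.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch also runs outside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Synthetic stand-in for df1['A'] (assumed to be roughly standard normal).
values = pd.Series(rng.standard_normal(1000), name="A")

fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(10, 3))

# Accessor form: bins controls how many bars the value range is split into.
values.plot.hist(bins=20, ax=ax_left)
ax_left.set_title("plot.hist(bins=20)")

# Keyword form: same plot, selected through kind='hist'.
values.plot(kind="hist", bins=20, ax=ax_right)
ax_right.set_title("plot(kind='hist', bins=20)")

# Each bar is a rectangle patch on the Axes; its height is the count in that bin.
n_left = len(ax_left.patches)
n_right = len(ax_right.patches)
total_left = sum(p.get_height() for p in ax_left.patches)
print(n_left, n_right, total_left)
```

Both calls return a matplotlib Axes object, so the result can be customised further with ordinary matplotlib methods such as `set_title` or `set_xlabel`.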

Creating a stacked bar visualization from the dataframe's columns


In [20]:
#df2.plot.area()
df2.plot.bar(stacked=True)


Out[20]:
<matplotlib.axes._subplots.AxesSubplot at 0x9e05f98>
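With `stacked=True`, each row becomes one bar and each column contributes a segment placed on top of the previous one. A small sketch that verifies this follows, using the first three rows of the `df2.head()` output above as data. The variable names are illustrative.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch also runs outside a notebook
import numpy as np
import pandas as pd

# First three rows of the df2.head() output shown earlier.
df = pd.DataFrame(
    {
        "a": [0.039762, 0.937288, 0.780504],
        "b": [0.218517, 0.041567, 0.008948],
        "c": [0.103423, 0.899125, 0.557808],
        "d": [0.957904, 0.977680, 0.797510],
    }
)

ax = df.plot.bar(stacked=True)

# ax.containers holds one group of rectangles per column, in column order.
# For row 0, the top edge of each segment should equal the running sum of the row.
first_row_tops = [c[0].get_y() + c[0].get_height() for c in ax.containers]
expected_tops = df.iloc[0].cumsum().to_numpy()
print(np.allclose(first_row_tops, expected_tops))
```

Stacking is easiest to read when all values share a sign, as they do here; `df.plot.area()` gives the continuous analogue of the same idea.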

Line plot of column 'B' from df1, drawn against the index


In [24]:
# x must be a column label; when it is omitted, the index is used for the horizontal axis
df1.plot.line(y='B', figsize=(12, 3), lw=1)


Out[24]:
<matplotlib.axes._subplots.AxesSubplot at 0xd013588>
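A minimal sketch of a single-column line plot drawn against a date index follows. When `x` is not given, pandas uses the index for the horizontal axis. The frame below is synthetic (random values on an assumed daily date range), not the contents of the `df1` file.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch also runs outside a notebook
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
# Synthetic stand-in for df1: 100 daily rows, four columns.
index = pd.date_range("2000-01-01", periods=100, freq="D")
df = pd.DataFrame(rng.standard_normal((100, 4)), index=index, columns=list("ABCD"))

# y selects the column; figsize is in inches and lw is the matplotlib line width.
ax = df.plot.line(y="B", figsize=(12, 3), lw=1)

# The drawn line carries the column's values unchanged.
plotted = ax.lines[0].get_ydata()
print(len(ax.lines), len(plotted))
```

A wide, short `figsize` such as `(12, 3)` tends to suit long time series, because it stretches the time axis without exaggerating vertical noise.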

Creating a scatter plot of columns A and B, with marker size scaled by column C


In [28]:
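#df1.plot.scatter(x='A', y='B', c='C', cmap='coolwarm')
# Marker sizes must be non-negative; column C contains negative values, so use its absolute value
df1.plot.scatter(x='A', y='B', s=df1['C'].abs() * 100)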


Out[28]:
<matplotlib.axes._subplots.AxesSubplot at 0xbe45198>
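The two scatter variants above encode a third column either as colour (`c`, `cmap`) or as marker size (`s`). The sketch below combines both on synthetic data. Taking the absolute value of column C for the sizes is an assumption made here, because matplotlib marker areas cannot be negative.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch also runs outside a notebook
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
# Synthetic stand-in for df1 with the three columns used by the plot.
df = pd.DataFrame(rng.standard_normal((200, 3)), columns=list("ABC"))

# Marker area in points squared; abs() keeps every size non-negative (assumed scaling).
sizes = df["C"].abs() * 100

# c='C' colours each point by the value in column C, mapped through the colormap.
ax = df.plot.scatter(x="A", y="B", s=sizes, c="C", cmap="coolwarm")

# The points live in a single collection on the Axes.
drawn_sizes = np.asarray(ax.collections[0].get_sizes())
print(len(ax.collections), drawn_sizes.min() >= 0)
```

Colour keeps the sign of C visible while size shows only its magnitude, so using both recovers the information that `abs()` discards.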

Other built-in pandas visualization functions, such as box plots, hexbin plots, and KDE (kernel density estimate) plots


In [29]:
df2.plot.box()


Out[29]:
<matplotlib.axes._subplots.AxesSubplot at 0xeb8e3c8>
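A box plot summarises each column by its quartiles: with matplotlib's defaults the box spans the first to the third quartile and the inner line marks the median. The sketch below draws one on synthetic uniform data (an assumed stand-in for `df2`) and prints the same quartiles numerically for comparison.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch also runs outside a notebook
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
# Synthetic stand-in for df2: values in [0, 1), four columns.
df = pd.DataFrame(rng.random((50, 4)), columns=list("abcd"))

# One box per column.
ax = df.plot.box()

# The quartiles that the boxes visualise, computed directly.
quartiles = df.describe().loc[["25%", "50%", "75%"]]
print(quartiles)
```

Reading the table next to the figure is a quick way to confirm which line in each box corresponds to which statistic.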

In [30]:
df = pd.DataFrame(np.random.randn(1000,2), columns=['a','b'])
df.head()


Out[30]:
a b
0 1.754893 -1.456810
1 0.497010 -0.006984
2 -0.032073 0.871510
3 0.717830 -0.373297
4 -2.319448 0.150371

In [32]:
df.plot.hexbin(x='a', y='b', gridsize=25)


Out[32]:
<matplotlib.axes._subplots.AxesSubplot at 0xed12cc0>
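The `gridsize` argument sets how many hexagons span the x direction, so larger values give smaller cells and a finer picture of the density. A rough sketch comparing two grid sizes on the same kind of random data follows. The specific values 10 and 25 and the seed are illustrative choices.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch also runs outside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
df = pd.DataFrame(rng.standard_normal((1000, 2)), columns=["a", "b"])

fig, (ax_coarse, ax_fine) = plt.subplots(1, 2, figsize=(10, 4))

# Fewer, larger hexagons.
df.plot.hexbin(x="a", y="b", gridsize=10, ax=ax_coarse)
# More, smaller hexagons, as in the cell above.
df.plot.hexbin(x="a", y="b", gridsize=25, ax=ax_fine)

# Each hexbin plot is one collection; its array holds the count per hexagon.
coarse_counts = ax_coarse.collections[0].get_array()
fine_counts = ax_fine.collections[0].get_array()
print(len(coarse_counts), len(fine_counts))
```

A grid that is too fine leaves most cells with zero or one point, so the useful range depends on how many rows the dataframe has.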

In [35]:
#df2['a'].plot.kde()
#df2['a'].plot.density()
df2.plot.kde()


Out[35]:
<matplotlib.axes._subplots.AxesSubplot at 0xee751d0>
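`plot.kde()` and `plot.density()` are two names for the same plot: a smoothed estimate of each column's distribution. The sketch below assumes SciPy is installed, which pandas needs for KDE plots, and uses synthetic data in place of `df2`. The `bw_method` values are illustrative, not recommendations.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch also runs outside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
# Synthetic stand-in for df2: values in [0, 1), four columns.
df = pd.DataFrame(rng.random((200, 4)), columns=list("abcd"))

fig, (ax_default, ax_narrow) = plt.subplots(1, 2, figsize=(10, 3))

# Default bandwidth: one smooth curve per column.
df.plot.kde(ax=ax_default)

# A smaller bw_method follows the data more closely and gives a bumpier curve.
df.plot.kde(bw_method=0.1, ax=ax_narrow)

print(len(ax_default.lines), len(ax_narrow.lines))
```

Calling the method on a single column, as in the commented-out lines above, draws just one curve.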
