Let's play with the Anscombe Quartet (a special, artificial data set). First import some plotting libaries again and make that the plots look nice


In [1]:
%pylab inline
import matplotlib.pyplot as plt
#use a nicer plotting style
plt.style.use(u'fivethirtyeight')
print(plt.style.available)

#change figure size
pylab.rcParams['figure.figsize'] = (10, 6)


/usr/local/lib/python2.7/site-packages/matplotlib/__init__.py:872: UserWarning: axes.color_cycle is deprecated and replaced with axes.prop_cycle; please use the latter.
  warnings.warn(self.msg_depr % (key, alt_key))
Populating the interactive namespace from numpy and matplotlib
[u'seaborn-darkgrid', u'seaborn-notebook', u'classic', u'seaborn-ticks', u'grayscale', u'bmh', u'seaborn-talk', u'dark_background', u'ggplot', u'fivethirtyeight', u'seaborn-colorblind', u'seaborn-deep', u'seaborn-whitegrid', u'seaborn-bright', u'seaborn-poster', u'seaborn-muted', u'seaborn-paper', u'seaborn-white', u'seaborn-pastel', u'seaborn-dark', u'seaborn-dark-palette']

Now let's import the data and use box plots to get an overview.


In [3]:
import pandas as pd
aq=pd.read_csv('data/anscombesQuartet.csv')

In [5]:
aq


Out[5]:
I_x I_y II_x II_y III_x III_y IV_x IV_y
0 10.0 8.04 10.0 9.14 10.0 7.46 8.0 6.58
1 8.0 6.95 8.0 8.14 8.0 6.77 8.0 5.76
2 13.0 7.58 13.0 8.74 13.0 12.74 8.0 7.71
3 9.0 8.81 9.0 8.77 9.0 7.11 8.0 8.84
4 11.0 8.33 11.0 9.26 11.0 7.81 8.0 8.47
5 14.0 9.96 14.0 8.10 14.0 8.84 8.0 7.04
6 6.0 7.24 6.0 6.13 6.0 6.08 8.0 5.25
7 4.0 4.26 4.0 3.10 4.0 5.39 19.0 12.50
8 12.0 10.84 12.0 9.13 12.0 8.15 8.0 5.56
9 7.0 4.82 7.0 7.26 7.0 6.42 8.0 7.91
10 5.0 5.68 5.0 4.74 5.0 5.73 8.0 6.89

In [ ]:
aq.describe()

In [ ]:
plot(aq['I_x'], aq['I_y'],'o')
figure()
plot(aq['II_x'], aq['II_y'],'o')
figure()
plot(aq['III_x'], aq['III_y'],'o')
figure()
plot(aq['IV_x'], aq['IV_y'],'o')

In [ ]:


In [ ]: