In [2]:
%qtconsole
In [3]:
%matplotlib inline
This notebook presents plotting examples on the famous iris data set, using the Grammar of Graphics as implemented in the ggplot package. The package suits users coming from R because of its stated design goal:
The goal is to have no difference other than those necessary due to the differences between R and Python.
In [4]:
from ggplot import *
import pandas as pd
from sklearn import datasets
In [5]:
# import iris data
iris = datasets.load_iris()
df1 = pd.DataFrame(iris.data, columns = iris.feature_names)
df2 = pd.DataFrame(iris.target_names[iris.target])
df = pd.concat([df1, df2], axis = 1)
df.head()
Out[5]:
In [6]:
# short names: sepal length, sepal width, petal length, petal width, species
df.columns = ['sl', 'sw', 'pl', 'pw', 'species']
Plotting the two continuous variables sl and sw (sepal length and sepal width), colored by the class labels in the species variable, shows whether the classes differ in this 2D space (via a scatter plot).
In [10]:
p1 = ggplot(aes(x = 'sl', y = 'sw', color = 'species'), data = df) + geom_point()
p1
Out[10]:
The plot shows that the setosa class can be linearly separated from the other two classes.
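The separability claim can also be checked numerically. The following is a minimal sketch that is not part of the original notebook: it assumes scikit-learn's SVC with a linear kernel, and the large penalty C = 1000 is an arbitrary choice meant to approximate a hard-margin classifier. A training accuracy of 1.0 would mean that a single straight line in the (sl, sw) plane puts every setosa sample on one side and every other sample on the other.

```python
import numpy as np
from sklearn import datasets
from sklearn.svm import SVC

iris = datasets.load_iris()

# Keep only the first two features: sepal length (sl) and sepal width (sw).
X = iris.data[:, :2]

# Binary target: 1 for setosa, 0 for versicolor and virginica.
is_setosa = (iris.target_names[iris.target] == "setosa").astype(int)

# Linear-kernel SVM; C=1000.0 is an assumed value, chosen only so that
# misclassified training points are penalised heavily.
clf = SVC(kernel="linear", C=1000.0)
clf.fit(X, is_setosa)

# Fraction of training samples that fall on the correct side of the line.
train_accuracy = clf.score(X, is_setosa)
print("training accuracy:", train_accuracy)

# The fitted line is w[0] * sl + w[1] * sw + b = 0.
print("w:", clf.coef_[0], "b:", clf.intercept_[0])
```

The same check with the target set to versicolor or virginica would show how far those two classes are from being separable in this pair of variables.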
In [11]:
p2 = ggplot(aes(x = 'sl', y = 'sw', group = 'species', color = 'species'), data = df) + \
geom_point() + geom_smooth(alpha = 0.5) + theme_bw()
p2
Out[11]:
In [36]:
p3 = ggplot(aes(x = 'sl', y = 'sw', color = 'species'), data = df[df.species != 'setosa']) + \
geom_point() + theme_538()
p3
Out[36]:
In [12]:
p4 = ggplot(aes(x = 'sl'), data = df) + geom_histogram() + facet_wrap('species', ncol = 1)
p4
Out[12]: