Data Visualization with Seaborn

Why seaborn?

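In the previous machine learning examples we already saw some plots and visualizations that helped us understand the data. Seaborn is built on top of matplotlib and works directly with pandas DataFrames, so many common statistical plots take a single function call. In this exercise we practice a few more visualization techniques.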

We will again work with the Iris data, so please get familiar with that data set first. In this exercise you will learn how to make joint plots, box plots, pair plots, and a few other kinds of visualizations. If you are not sure what these terms mean, please look them up online.


In [8]:
# Note how we import the seaborn package and give it the alias sns
import seaborn as sns

In [6]:
import pandas as pd
# read CSV file directly from a URL and save the results
iris = pd.read_csv('https://raw.githubusercontent.com/uiuc-cse/data-fa14/gh-pages/data/iris.csv')

In [3]:
iris.head()


Out[3]:
sepal_length sepal_width petal_length petal_width species
0 5.1 3.5 1.4 0.2 setosa
1 4.9 3.0 1.4 0.2 setosa
2 4.7 3.2 1.3 0.2 setosa
3 4.6 3.1 1.5 0.2 setosa
4 5.0 3.6 1.4 0.2 setosa

In [4]:
iris["species"].value_counts()


Out[4]:
versicolor    50
setosa        50
virginica     50
Name: species, dtype: int64

Joint Plot


In [7]:
# This line is not redundant: look up what the %matplotlib inline magic does and why a notebook needs it.
%matplotlib inline

In [10]:
# Joint plot of two variables, sepal_length and sepal_width.
# Note: seaborn 0.9 renamed the size argument to height; on older versions use size=5.
sns.jointplot(x="sepal_length", y="sepal_width", data=iris, height=5)


Out[10]:
<seaborn.axisgrid.JointGrid at 0x155ca6fc160>

Problem 1. Please make another joint plot for petal_length and petal_width.


In [ ]:
# Write your code here, then run it

Facet Grid Plots


In [11]:
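import matplotlib.pyplot as plt
# Color the points by species in a single scatter plot.
# Note: seaborn 0.9 renamed the size argument to height; on older versions use size=5.
sns.FacetGrid(iris, hue="species", height=5)\
    .map(plt.scatter, "sepal_length", "sepal_width")\
    .add_legend()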


Out[11]:
<seaborn.axisgrid.FacetGrid at 0x155ca904f28>

Boxplot


In [12]:
ax = sns.boxplot(x="species", y="petal_length", data=iris)
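
To read a box plot it helps to know which numbers it draws: the box spans the first to the third quartile and the line inside it marks the median. As a rough sketch, the helper below computes those values per group with pandas so you can compare them with the picture; box_stats and its column names q1, median, q3 and iqr are hypothetical names chosen for illustration.

```python
import pandas as pd


def box_stats(df: pd.DataFrame, value: str = "petal_length",
              by: str = "species") -> pd.DataFrame:
    """Per-group quartiles, i.e. the numbers a box plot draws as its box.

    Hypothetical helper for illustration; it is not part of seaborn.
    """
    # One row per group, one column per requested quantile.
    q = df.groupby(by)[value].quantile([0.25, 0.5, 0.75]).unstack()
    q.columns = ["q1", "median", "q3"]
    # The interquartile range is the height of the box. With default
    # settings the whiskers reach to the most extreme data points that
    # lie within 1.5 * IQR of the box edges.
    q["iqr"] = q["q3"] - q["q1"]
    return q


# In the notebook, with iris loaded as above:
# box_stats(iris, "petal_length")
```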


Problem 2. Please make another box plot to compare petal width across the different species.


In [ ]:
# Please put your code here and then run it

Optional problem.

Please read the pairplot documentation at http://seaborn.pydata.org/generated/seaborn.pairplot.html. Then make a pair plot of all the variables, with the kind parameter set to scatter.


In [ ]:
# Code here and run it.