# Some examples of Seaborn visualizations

This notebook uses code from the book Python Data Science Handbook, by Jake VanderPlas.

``````

In [1]:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

%matplotlib inline

``````
``````

In [2]:

# Random data
rng = np.random.RandomState(0)
x = np.linspace(0, 10, 500)
y = np.cumsum(rng.randn(500, 6), 0)

# 1. Plot the data with Matplotlib defaults
plt.plot(x, y)
plt.legend('ABCDEF', ncol=2, loc='upper left');

``````
``````

In [3]:

# 2. Now let's see what Seaborn can do
import seaborn as sns
sns.set()

# same data defined above (x, y)
plt.plot(x, y)
plt.legend('ABCDEF', ncol=2, loc='upper left');

``````

## Exploring Seaborn Plots

### Histograms, KDE, and densities

``````

In [4]:

data = np.random.multivariate_normal([0, 0], [[5, 2], [2, 2]], size=2000)
data = pd.DataFrame(data, columns=['x', 'y'])

for col in 'xy':
    # density=True normalizes each histogram to unit area (it replaces the older normed=True)
    plt.hist(data[col], density=True, alpha=0.5)

``````

Instead of a histogram, we can get a smooth estimate of the distribution using kernel density estimation, which Seaborn does with sns.kdeplot:

``````

In [5]:

for col in 'xy':
    sns.kdeplot(data[col])

``````
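
To make concrete what a kernel density estimate computes, here is a minimal NumPy sketch: every sample contributes a small Gaussian bump, and the bumps are averaged into one smooth curve. The function name `gaussian_kde_1d` and the Scott's-rule bandwidth are choices made for this illustration; they are not a description of what sns.kdeplot does internally.

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth=None):
    """Evaluate a Gaussian kernel density estimate of `samples` on `grid`."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    n = samples.size
    if bandwidth is None:
        # Scott's rule of thumb (an assumption made for this sketch)
        bandwidth = samples.std(ddof=1) * n ** (-1 / 5)
    # Standardized distance from every grid point to every sample
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)
    # Average the bumps and rescale so the curve integrates to 1
    return kernels.sum(axis=1) / (n * bandwidth)

rng = np.random.RandomState(0)
samples = rng.randn(2000)
grid = np.linspace(-8, 8, 801)
density = gaussian_kde_1d(samples, grid)

# Riemann-sum check: a density should integrate to approximately 1
area = density.sum() * (grid[1] - grid[0])
print(round(area, 3))
```

A smaller bandwidth gives a wigglier curve that follows individual samples; a larger one smooths more aggressively.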

Histograms and KDE can be combined using distplot (deprecated since Seaborn 0.11 in favor of histplot and displot):

``````

In [6]:

sns.distplot(data['x'])
sns.distplot(data['y']);

``````

### Pair plots

Pair plots generalize joint plots to datasets of higher dimension. They are useful for exploring correlations in multidimensional data, because they plot every pair of variables against each other.

``````

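In [7]:

# Load the iris example dataset (Seaborn fetches it online on first use)
iris = sns.load_dataset('iris')
iris.head()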

``````
``````

Out[7]:

   sepal_length  sepal_width  petal_length  petal_width species
0           5.1          3.5           1.4          0.2  setosa
1           4.9          3.0           1.4          0.2  setosa
2           4.7          3.2           1.3          0.2  setosa
3           4.6          3.1           1.5          0.2  setosa
4           5.0          3.6           1.4          0.2  setosa

``````

Now the sns.pairplot visualization of this dataset, colored by species:

``````

In [8]:

# The size argument was renamed to height in Seaborn 0.9
sns.pairplot(iris, hue='species', height=2.5);

``````
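
A pair plot shows pairwise relationships graphically; the same pairwise structure can be summarized numerically with a correlation matrix. The sketch below uses synthetic data with hypothetical column names `a`, `b`, and `c` (not the iris measurements), so the expected pattern is known in advance.

```python
import numpy as np
import pandas as pd

rng = np.random.RandomState(0)
a = rng.randn(500)
frame = pd.DataFrame({
    'a': a,
    'b': 2 * a + 0.1 * rng.randn(500),  # almost a linear function of 'a'
    'c': rng.randn(500),                # independent noise
})

# One Pearson correlation per pair of columns, mirroring the pair plot grid
corr = frame.corr()
print(corr.round(2))
```

Here `a` and `b` are strongly correlated while `c` is close to uncorrelated with both, which is the kind of structure a pair plot makes visible at a glance.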

### Faceted histograms

``````

In [9]:

# "Tips" dataset (Seaborn fetches it online on first use)
tips = sns.load_dataset('tips')
tips.head()

``````
``````

Out[9]:

   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4

``````
``````

In [10]:

tips['tip_pct'] = 100 * tips['tip'] / tips['total_bill']

grid = sns.FacetGrid(tips, row="sex", col="time", margin_titles=True)
grid.map(plt.hist, "tip_pct", bins=np.linspace(0, 40, 15));

``````
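
Conceptually, the faceted histogram above splits the rows by each (sex, time) combination and histograms `tip_pct` within each subset using shared bins. A rough pandas sketch of that idea follows; the `tips_demo` frame holds invented values shaped like the tips dataset, and this is not how FacetGrid is implemented.

```python
import numpy as np
import pandas as pd

# Hypothetical rows shaped like the tips dataset (values invented for illustration)
tips_demo = pd.DataFrame({
    'total_bill': [20.0, 10.0, 40.0, 25.0, 50.0, 16.0],
    'tip':        [3.0, 2.0, 4.0, 5.0, 5.0, 4.0],
    'sex':        ['Female', 'Male', 'Male', 'Female', 'Male', 'Female'],
    'time':       ['Dinner', 'Dinner', 'Lunch', 'Lunch', 'Dinner', 'Dinner'],
})
tips_demo['tip_pct'] = 100 * tips_demo['tip'] / tips_demo['total_bill']

# Same bin edges for every facet, so the panels are directly comparable
bins = np.linspace(0, 40, 15)

facet_counts = {}
for (sex, time), subset in tips_demo.groupby(['sex', 'time']):
    counts, _ = np.histogram(subset['tip_pct'], bins=bins)
    facet_counts[(sex, time)] = counts
    print(sex, time, counts.sum())
```

Each dictionary entry corresponds to one panel of the grid: the key picks the row and column, and the counts are the bar heights.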

### Bar plots

``````

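In [11]:

# "Planets" dataset (Seaborn fetches it online on first use)
planets = sns.load_dataset('planets')
planets.head()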

``````
``````

Out[11]:

   method  number  orbital_period   mass  distance  year
0               1         269.300   7.10     77.40  2006
1               1         874.774   2.21     56.95  2008
2               1         763.000   2.60     19.84  2011
3               1         326.030  19.40    110.62  2007
4               1         516.220  10.50    119.47  2009

``````
``````

In [12]:

# factorplot was renamed to catplot in Seaborn 0.9
with sns.axes_style('white'):
    g = sns.catplot(x="year", data=planets, aspect=2,
                    kind="count", color='steelblue')
    g.set_xticklabels(step=5)

``````
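
A count plot draws one bar per category, with the bar height equal to the number of rows in that category. The sketch below computes those heights directly; the years are invented for illustration and are not taken from the planets data.

```python
import pandas as pd

# Hypothetical discovery years (invented for illustration, not the planets data)
years = pd.Series([2006, 2008, 2011, 2008, 2011, 2011], name='year')

# Rows per year, ordered by year as along the x-axis of a count plot
counts = years.value_counts().sort_index()
print(counts)
```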

The same counts can be broken down further, here by discovery method and restricted to the years 2001 to 2014:

``````

In [13]:

# factorplot was renamed to catplot in Seaborn 0.9
with sns.axes_style('white'):
    g = sns.catplot(x="year", data=planets, aspect=4.0, kind='count',
                    hue='method', order=range(2001, 2015))
    g.set_ylabels('Number of Planets Discovered')

``````

A simple bar chart of counts:

``````

In [14]:

# "Titanic" dataset (Seaborn fetches it online on first use)
titanic = sns.load_dataset('titanic')

``````
``````

In [15]:

sns.countplot(x="deck", data=titanic, palette="Greens_d");

``````