Seaborn is a library for "making attractive and informative statistical graphics in Python". It is built on top of matplotlib and complements it rather than replacing it. When using Seaborn, it is still common to use many standard matplotlib commands.
This tutorial "steals" parts from the Seaborn documentation.
In [ ]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
Let us first define a function plotting some sine waves so that we have something to plot.
In [ ]:
def sinplot(flip=1):
    x = np.linspace(0, 14, 100)
    for i in range(1, 7):
        plt.plot(x, np.sin(x + i * .5) * (7 - i) * flip)
With pure matplotlib the plot looks somewhat "blunt":
In [ ]:
sinplot()
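Seaborn makes it easy to get much nicer plots. In old versions (before 0.8) importing Seaborn was enough; current versions no longer change the style on import, so you also have to activate the defaults with sns.set():
In [ ]:
import seaborn as sns
sns.set()  # apply the Seaborn default theme
sinplot()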
Seaborn provides a number of different aesthetics that can be activated with the set function. These are defined by three things: the context (paper, notebook, talk or poster), the style (darkgrid, whitegrid, dark, white or ticks) and the color palette (for example deep, muted, pastel, bright, dark or colorblind).
For publications, set figsize to what the final output size should be. (figsize is given in inches, and journals usually tell you what size your figures are allowed to be.) If you save your figures as PDF and include them in a LaTeX document, you won't need to adjust the scaling anymore, provided figsize is set correctly.
Let us try out some of the combinations:
In [ ]:
sns.set(context='paper')
sinplot()
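For a publication figure, the paper context can be combined with an explicit figsize. The following is a minimal sketch, not a recipe from any journal: the size of 3.5 by 2.5 inches and the output file name are assumptions chosen for illustration, and the file is written to the system's temporary directory.

```python
import os
import tempfile

import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

sns.set(context="paper", style="ticks")

# Assumed size: 3.5 x 2.5 inches; replace with the width your journal allows.
fig, ax = plt.subplots(figsize=(3.5, 2.5))
x = np.linspace(0, 14, 100)
ax.plot(x, np.sin(x))
ax.set_xlabel("x")
ax.set_ylabel("sin(x)")
fig.tight_layout()

# Hypothetical file name; the PDF page has the physical size given by figsize.
out_path = os.path.join(tempfile.gettempdir(), "sinplot_paper.pdf")
fig.savefig(out_path)
```

In the LaTeX document the file can then be included with \includegraphics without a scale or width option, so the font sizes in the figure stay as they were set.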
In [ ]:
sns.set(style='whitegrid')
sinplot()
In [ ]:
sns.set(style='ticks')
sinplot()
In [ ]:
sns.set(palette='colorblind')
sinplot()
Styles etc. can also be applied temporarily within a with statement.
In [ ]:
with sns.axes_style('ticks'), sns.color_palette('colorblind'):
    sinplot()
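To see that such a style really is temporary, one can look at matplotlib's rcParams before, inside and after the with block. This is only a small sketch; it uses the axes.grid setting as an indicator, because the darkgrid style switches the grid on and the ticks style switches it off.

```python
import matplotlib as mpl
import seaborn as sns

sns.set(style="darkgrid")            # global style: grid switched on
before = mpl.rcParams["axes.grid"]

with sns.axes_style("ticks"):        # temporary style: no grid
    inside = mpl.rcParams["axes.grid"]

after = mpl.rcParams["axes.grid"]    # the global style is active again

print(before, inside, after)         # prints: True False True
```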
Usually such plots look better without the upper and right box boundaries. These can easily be removed with sns.despine().
In [ ]:
sns.set(style='ticks')
sinplot()
sns.despine()
Some plots look even better when offsetting the axes.
In [ ]:
sns.set(style='ticks')
sinplot()
sns.despine(offset=10)
In [ ]:
sns.set() # Restore Seaborn defaults
Seaborn not only improves the matplotlib styling, it also provides a number of (mainly statistical) plotting functions that make it easy to create a defined set of plots that are usually much more complicated to produce in matplotlib. The following is just a brief sample of what Seaborn offers. Refer to the documentation for the full glory.
The Seaborn plotting functions make heavy use of Pandas data frames and use metadata from those frames (such as the column names) to label axes automatically.
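A small sketch of the automatic labelling: when columns of a data frame are passed by name, the axis labels are taken from the column names. The data and the column names below are made up for this example.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical measurements; the column names are invented for illustration.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "height_cm": rng.normal(170, 10, size=50),
    "weight_kg": rng.normal(70, 8, size=50),
})

ax = sns.scatterplot(x="height_cm", y="weight_kg", data=df)

# No explicit set_xlabel/set_ylabel call was needed.
print(ax.get_xlabel(), ax.get_ylabel())
```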
In [ ]:
x = np.random.normal(size=100)
In [ ]:
# distplot is deprecated since Seaborn 0.11; displot is its replacement
sns.displot(x=x, kde=True)
In [ ]:
sns.displot(x=x, kind='kde', rug=True)
In [ ]:
mean, cov = [0, 1], [(1, .5), (.5, 1)]
data = np.random.multivariate_normal(mean, cov, 200)
df = pd.DataFrame(data, columns=['x', 'y'])
In [ ]:
sns.jointplot(x='x', y='y', data=df)
In [ ]:
sns.jointplot(x='x', y='y', kind='hex', data=df)
In [ ]:
sns.jointplot(x='x', y='y', kind='kde', data=df)
In [ ]:
iris = sns.load_dataset('iris')
sns.pairplot(iris)
The regression plots in Seaborn are mainly meant for exploratory analysis; to get actual quantitative measures, use statsmodels.
In [ ]:
tips = sns.load_dataset('tips')
sns.lmplot(x='total_bill', y='tip', data=tips)
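The regression line in such a plot comes without numbers. The following is a minimal sketch of how the coefficients could be obtained with statsmodels, assuming statsmodels is installed. To stay self-contained it uses synthetic data with the same column names as the tips data set; the true slope of 0.1 and intercept of 1.0 are assumptions of this example, not properties of the real data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the tips data: tip = 1 + 0.1 * total_bill + noise.
rng = np.random.default_rng(0)
total_bill = rng.uniform(5, 50, size=200)
tip = 1.0 + 0.1 * total_bill + rng.normal(0, 0.5, size=200)
df = pd.DataFrame({"total_bill": total_bill, "tip": tip})

# Ordinary least squares fit of a straight line, tip as a function of total_bill.
fit = smf.ols("tip ~ total_bill", data=df).fit()
slope = fit.params["total_bill"]
intercept = fit.params["Intercept"]
low, high = fit.conf_int().loc["total_bill"]  # 95 % confidence interval of the slope

print(f"tip = {intercept:.2f} + {slope:.3f} * total_bill")
print(f"slope confidence interval: {low:.3f} to {high:.3f}")
```

With the real data, the same call works on the tips frame directly; fit.summary() prints the full regression table.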
In [ ]:
sns.lmplot(x='size', y='tip', data=tips)
Some options to make such a plot of discrete values nicer:
In [ ]:
sns.lmplot(x='size', y='tip', data=tips, x_jitter=.05)
In [ ]:
sns.lmplot(x='size', y='tip', data=tips, x_estimator=np.mean)
lmplot offers more options, for example for fitting polynomials (order), robust regression against outliers (robust), logistic regression (logistic), and more.
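As an illustration of one of these options, the sketch below fits a second-degree polynomial by passing order=2. The data are synthetic and only serve to produce a visibly curved relationship.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data with a curved relationship: y = 0.5 * x**2 + noise.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=100)
df = pd.DataFrame({"x": x, "y": 0.5 * x**2 + rng.normal(0, 0.5, size=100)})

# order=2 fits a second-degree polynomial instead of a straight line.
g = sns.lmplot(x="x", y="y", data=df, order=2)
```

The robust=True and logistic=True options are used in the same way, but they require statsmodels to be installed.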
lmplot is also easy to extend to more complex plots, for example by splitting the data over colors, columns and rows:
In [ ]:
sns.lmplot(x='total_bill', y='tip', hue='smoker', col='time', row='sex',
           data=tips, markers=['o', 'x'], palette='Set1')
In [ ]:
sns.stripplot(x='day', y='total_bill', data=tips)
In [ ]:
sns.stripplot(x='day', y='total_bill', data=tips, jitter=True)
In [ ]:
sns.swarmplot(x='day', y='total_bill', data=tips)
In [ ]:
sns.swarmplot(x='day', y='total_bill', hue='time', data=tips)
In [ ]:
sns.boxplot(x='day', y='total_bill', hue='sex', data=tips)
In [ ]:
sns.violinplot(x='day', y='total_bill', hue='sex', data=tips)
In [ ]:
sns.violinplot(x='day', y='total_bill', hue='sex', data=tips, split=True, inner='stick')
In [ ]:
sns.barplot(x='day', y='total_bill', hue='sex', data=tips)
In [ ]:
sns.pointplot(x='day', y='total_bill', hue='time', data=tips, dodge=True)
In [ ]:
g = sns.FacetGrid(tips, col='smoker', row='sex', margin_titles=True)
g.map(plt.scatter, 'total_bill', 'tip', marker='s')
for ax in g.axes.flat:
    ax.plot((0, 50), (0, .2 * 50), c='.2', ls='--')
g.set(xlim=(0, 60), ylim=(0, 14))