Although the aesthetics of a figure are not sufficient to convey useful statistical information, attractive plots are more pleasant to look at when trying to understand your own data, and they can draw in your audience when you present it. Judicious use of stylistic detail, such as maintaining thematic colors across figures in a presentation, can support the communication of your ideas without distracting from the central message. While a beautiful color palette cannot save an incoherent plot, poor stylistic choices can obscure patterns in your data or mislead your audience about them.
Motivated by these considerations, seaborn tries to make it easy to control the look of your figures. This notebook walks through the set of tools that let you manipulate plot styles.
In [1]:
import numpy as np
from scipy import stats
import matplotlib as mpl
import matplotlib.pyplot as plt
np.random.seed(9221999)
Let's define a simple function to plot some offset sine waves to help us see the different stylistic parameters we can tweak.
In [2]:
def sinplot(flip=1):
    x = np.linspace(0, 14, 100)
    for i in range(1, 7):
        plt.plot(x, np.sin(x + i * .5) * (7 - i) * flip)
This is what the plot looks like with matplotlib defaults:
In [3]:
sinplot()
To switch to seaborn defaults, simply import the package.
In [4]:
import seaborn as sns
sinplot()
Seaborn plots break from the MATLAB-inspired aesthetic of matplotlib to plot in more muted colors over a light gray background with white grid lines. We find that the grid aids in the use of figures for conveying quantitative information; in almost all cases, figures should be preferred to tables. The white-on-gray grid that is used by default avoids being obtrusive. The grid is particularly useful in giving structure to figures with multiple facets, which is central to some of the more complex tools in the library:
In [5]:
f, (ax1, ax2) = plt.subplots(2, 1, sharex=True)
plt.sca(ax1)  # make ax1 the current axes so sinplot() draws into it
sinplot()
plt.sca(ax2)  # switch to the lower axes
sinplot(-1)
plt.tight_layout()
There are two other basic styles. One keeps the grid, but plots with a more traditional white background.
In [6]:
sns.set(style="whitegrid")
sinplot()
For this kind of plot, where the data are represented with lines, the gray grid complicates the figure and probably detracts more than it adds. However, many kinds of statistical plots give more weight to the foreground and look fine with the whitegrid style:
In [7]:
x = np.linspace(0, 14, 100)
y1 = np.sin(x + .5)
y2 = np.sin(x + 4 * .5) * 3
c1, c2 = sns.color_palette("deep", 2)
plt.plot(x, y1)
plt.fill_between(x, y1 - .5, y1 + .5, color=c1, alpha=.2)
plt.plot(x, y2)
plt.fill_between(x, y2 - .8, y2 + .8, color=c2, alpha=.2);
In [8]:
data = 1 + np.random.randn(20, 6)
sns.boxplot(data);
In [9]:
pos = np.arange(6) + .6
h = data.mean(axis=0)
err = data.std(axis=0) / np.sqrt(len(data))  # per-column standard error, matching h
plt.bar(pos, h, yerr=err, color=sns.husl_palette(6, s=.75), ecolor="#333333");
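Error bars like the ones above are usually standard errors of the column means. Here is a minimal sketch of that calculation for an array shaped like `data` (rows are observations, columns are categories). The helper name `sem_per_column` is hypothetical, and the choice of `ddof=1` (the sample standard deviation) is an assumption; NumPy's default is `ddof=0`.

```python
import numpy as np


def sem_per_column(values):
    """Standard error of the mean for each column of a 2D array (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    n_rows = values.shape[0]
    # ddof=1 is an assumption: it gives the sample standard deviation,
    # whereas NumPy's default (ddof=0) is the population form.
    return values.std(axis=0, ddof=1) / np.sqrt(n_rows)


# A separate random generator, so the notebook's global seed is left alone.
rng = np.random.RandomState(0)
demo = 1 + rng.randn(20, 6)
demo_err = sem_per_column(demo)  # one error value per column
```

The result has one entry per column, so it can be passed straight to the `yerr` argument of `plt.bar`.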
You can also turn off the grid altogether, which is closest to the default matplotlib style.
In [10]:
sns.set(style="nogrid")
sinplot()
Because of the way matplotlib figures work, the axis spines cannot be turned off as part of a default style. However, there is a convenience function in seaborn for stripping the top and right spines to open up the plot.
In [11]:
plt.bar(pos, h, yerr=err, color=sns.husl_palette(6, s=.75), ecolor="#333333")
sns.despine()
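If you are curious what stripping spines amounts to at the matplotlib level, the sketch below hides the top and right spines on a single axes. It is only an illustration of the idea, not seaborn's implementation, and `strip_spines` is a hypothetical name. The figure is built directly from `matplotlib.figure.Figure` so the example does not depend on any particular plotting backend.

```python
from matplotlib.figure import Figure


def strip_spines(ax, sides=("top", "right")):
    """Hide the named spines on one axes (hypothetical helper, not seaborn's code)."""
    for side in sides:
        ax.spines[side].set_visible(False)


fig = Figure()
ax = fig.add_subplot(111)
ax.bar(range(6), [1, 2, 3, 2, 1, 2])
strip_spines(ax)  # the left and bottom spines are left untouched
```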
To manipulate the look of more complex figures, you can use the optional arguments to despine.
In [12]:
sns.regplot(*np.random.randn(2, 100))
main, x_marg, y_marg = plt.gcf().axes
sns.despine(ax=main)
sns.despine(ax=x_marg, left=True)
sns.despine(ax=y_marg, bottom=True)
There is also the ticks style, which is like nogrid but with small ticks to help give structure to the plot. It's expected that this style will be used in conjunction with despine(), otherwise the plot looks a little...hairy.
In [13]:
sns.set(style="ticks")
sns.boxplot(data)
sns.despine()
The seaborn defaults are tailored to make plots that are well-proportioned for viewing on your own computer screen. There are a few other styles that try to set parameters like font sizes to be more appropriate for other settings, such as at a talk or on a poster:
In [14]:
sns.set(style="darkgrid", context="talk")
sns.boxplot(data)
plt.title("Score ~ Category");
sns.axlabel("Category", "Score")
In [15]:
sns.set(style="nogrid", context="poster")
sns.boxplot(data)
plt.title("Score ~ Category");
sns.axlabel("Category", "Score")
sns.despine()
I would expect both the specific elements of these styles and the API for specifying them to change somewhat as the package matures. In particular, there is not currently a way for seaborn to respect rc parameters that conflict with those it sets itself. Additionally, there is no support for custom themes. If you would find these features useful for your own work, please get in touch.
Let's reset the default styles.
In [16]:
sns.set()
Considerable effort has been invested in a simple yet uniform interface for creating and specifying color palettes, as color is one of the most important (and also one of the trickiest) aspects of making clear and informative plots.
The default color scheme is based on the matplotlib default while aiming to be a bit more pleasant to look at. To grab the current color cycle, call the color_palette function with no arguments. This just returns a list of r, g, b tuples:
In [17]:
current_palette = sns.color_palette()
current_palette
Seaborn has a small function to visualize a palette, which is useful for documentation and possibly for when you are choosing colors for your own plots.
In [18]:
sns.palplot(current_palette)
It's also easy to get evenly spaced hues in the husl or hls color spaces. The former is preferred for its perceptual uniformity, although the individual colors can be relatively less attractive than their brighter versions in the latter.
In [19]:
sns.palplot(sns.color_palette("husl", 8))
In [20]:
sns.palplot(sns.color_palette("hls", 8))
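To make "evenly spaced hues" concrete, here is a small sketch that walks around the HLS color wheel using only the standard library's `colorsys` module. The function name `evenly_spaced_hls` and the lightness and saturation defaults are illustrative assumptions, not values taken from seaborn. The standard library has no HUSL conversion, so only the hls case is shown.

```python
import colorsys


def evenly_spaced_hls(n_colors, lightness=0.6, saturation=0.65):
    """Return n_colors RGB tuples whose hues are evenly spaced around the HLS wheel.

    The lightness and saturation defaults are illustrative assumptions.
    """
    hues = [i / float(n_colors) for i in range(n_colors)]
    return [colorsys.hls_to_rgb(h, lightness, saturation) for h in hues]


wheel = evenly_spaced_hls(8)  # eight (r, g, b) tuples with components in [0, 1]
```

Because only the hue changes from one color to the next, every entry shares the same nominal lightness and saturation, even though some hues will look brighter to the eye than others.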
You can also use the name of any matplotlib colormap, and the palette will contain evenly spaced samples taken between points near the extremes of the colormap.
In [21]:
sns.palplot(sns.color_palette("coolwarm", 7))
Palettes can be broadly categorized as diverging (as is the palette above), sequential, or qualitative. Diverging palettes are useful when the data have a natural, meaningful break-point. Sequential palettes are better when the data range from "low" to "high" values.
In [22]:
sns.palplot(sns.color_palette("RdPu_r", 8))
Categorical data are best represented by a qualitative palette. Seaborn fixes some problems inherent in the way matplotlib deals with the qualitative palettes from the ColorBrewer package, although they behave a little differently. If you request more colors than exist for a given qualitative palette, the colors will cycle, which is not the case for other matplotlib-based palettes.
In [23]:
sns.palplot(sns.color_palette("Set2", 10))
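The cycling behavior can be pictured with a few lines of standard-library Python. This is only a sketch of the idea, with a hypothetical helper name `take_cycled` and three illustrative hex codes standing in for a short qualitative palette; it makes no claim about how seaborn does it internally.

```python
from itertools import cycle, islice


def take_cycled(colors, n_colors):
    """Return n_colors entries, wrapping back to the start when colors runs out."""
    return list(islice(cycle(colors), n_colors))


base = ["#66c2a5", "#fc8d62", "#8da0cb"]  # illustrative hex codes
extended = take_cycled(base, 5)  # the fourth and fifth entries repeat the first two
```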
Finally, you can just pass in a list of color codes to specify a custom palette.
In [24]:
sns.palplot(sns.color_palette(["#8C1515", "#D2C295"], 5))
Many seaborn functions use the color_palette function behind the scenes, and thus accept any of the valid arguments for their color or palette parameter.
In [25]:
sns.violinplot(data, inner="points", color="Set3");
Two other functions allow you to create custom palettes. The first takes a color and creates a blend to it from a very dark gray.
In [26]:
sns.palplot(sns.dark_palette("MediumPurple"))
Note that the interpolation that is done behind the scenes is not currently performed in a color space that is compatible with human perception, so the increments of color in these palettes will not necessarily appear uniform.
In [27]:
sns.palplot(sns.dark_palette("skyblue", 8, reverse=True))
By default you just get a list of colors, like any other seaborn palette, but you can also return the palette as a colormap object that can be passed to matplotlib functions.
In [28]:
sample = np.random.multivariate_normal([0, 0], [[1, -.5], [-.5, 1]], size=100)
pal = sns.dark_palette("palegreen", as_cmap=True)
plt.figure(figsize=(6, 6))
sns.kdeplot(sample, cmap=pal);
There's a related trick embedded in palette production that allows you to specify the name of a sequential ColorBrewer palette with a "_d" suffix. This will create a palette that is harmonious with the base palette, but using colors that are dark enough to draw line or contour plots.
In [29]:
sns.palplot(sns.color_palette("BuPu_d"))
In [30]:
sns.kdeplot(sample[:, (1, 0)], cmap="BuPu_d");
A more general function for making custom palettes interpolates between an arbitrary number of seed points. You could use this to make your own diverging palette.
In [31]:
sns.palplot(sns.blend_palette(["mediumseagreen", "ghostwhite", "#4168B7"], 9))
Or to create a sequential palette along a saturation scale.
In [32]:
sns.palplot(sns.blend_palette([sns.desaturate("#009B76", 0), "#009B76"], 5))
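As a rough picture of what interpolating between seed points means, the sketch below blends RGB tuples with plain linear interpolation, one channel at a time. It assumes the seeds sit at equal spacing and works directly in RGB, which is not a perceptually uniform space; it is not necessarily how seaborn implements `blend_palette`, and `blend_rgb` is a hypothetical name.

```python
import numpy as np


def blend_rgb(seeds, n_colors):
    """Linearly interpolate n_colors RGB tuples through the seed colors (a sketch)."""
    seeds = np.asarray(seeds, dtype=float)      # shape (n_seeds, 3)
    seed_pos = np.linspace(0, 1, len(seeds))    # assumption: seeds are equally spaced
    out_pos = np.linspace(0, 1, n_colors)
    # Interpolate the red, green, and blue channels independently.
    channels = [np.interp(out_pos, seed_pos, seeds[:, c]) for c in range(3)]
    return [tuple(float(v) for v in rgb) for rgb in np.column_stack(channels)]


# A small diverging blend: green on one side, white in the middle, blue on the other.
green, white, blue = (0.0, 0.6, 0.3), (1.0, 1.0, 1.0), (0.2, 0.4, 0.8)
diverging = blend_rgb([green, white, blue], 5)
```

With an odd number of output colors and three seeds, the middle color lands exactly on the middle seed, which is what gives a diverging palette its neutral break-point.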
The resulting palettes can be passed to any seaborn function that can take a palette as a parameter.
In [33]:
pal = sns.blend_palette(["seagreen", "lightblue"])
sns.boxplot(data, color=pal);
The set_color_palette function takes any of these inputs and sets the persistent axis color cycle.
In [34]:
sns.set_color_palette("husl")
sinplot()
You can also temporarily set the color cycle by using the palette_context function, which is a context manager.
In [35]:
with sns.palette_context("PuBuGn_d"):
    sinplot()
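The general pattern behind a context manager like this is to save the current setting, swap in the new one, and restore the original on exit. Here is a minimal sketch of that pattern written against matplotlib directly. It assumes a matplotlib version that exposes the `axes.prop_cycle` rc parameter (older releases used `axes.color_cycle`), and `temporary_color_cycle` is a hypothetical name rather than a seaborn function.

```python
from contextlib import contextmanager

import matplotlib as mpl
from cycler import cycler


@contextmanager
def temporary_color_cycle(colors):
    """Swap the matplotlib color cycle inside a with block, then restore it."""
    original = mpl.rcParams["axes.prop_cycle"]
    mpl.rcParams["axes.prop_cycle"] = cycler(color=colors)
    try:
        yield
    finally:
        # Restore the saved cycle even if the body of the with block raises.
        mpl.rcParams["axes.prop_cycle"] = original


before = mpl.rcParams["axes.prop_cycle"].by_key()["color"]
with temporary_color_cycle(["#8C1515", "#D2C295"]):
    inside = mpl.rcParams["axes.prop_cycle"].by_key()["color"]
after = mpl.rcParams["axes.prop_cycle"].by_key()["color"]
```

Putting the restore step in a `finally` clause is the important design choice: plots made after the block get the original colors back regardless of what happened inside it.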
The hope is that these tools will make it easier to create plots that are beautiful, both for the sake of beauty itself, and for the ways in which it can enhance the communication of statistical information.