.. _axis_grids:
.. currentmodule:: seaborn
Plotting on data-aware grids
============================
When exploring medium-dimensional data, a useful approach is to draw multiple instances of the same plot on different subsets of your dataset. This technique is sometimes called "lattice" or "trellis" plotting, and it is related to the idea of "small multiples". It allows a viewer to quickly extract a large amount of information about complex data. Matplotlib offers good support for making figures with multiple axes; seaborn builds on top of this to directly link the structure of the plot to the structure of your dataset.
To use these features, your data has to be in a pandas DataFrame and it must take the form of what Hadley Wickham calls "tidy" data. In brief, that means your dataframe should be structured such that each column is a variable and each row is an observation.
.. _facet_grid:
Subsetting data with :class:`FacetGrid`
---------------------------------------
The :class:`FacetGrid` class is useful when you want to visualize the distribution of a variable or the relationship between multiple variables separately within subsets of your dataset. A :class:`FacetGrid` can be drawn with up to three dimensions: ``row``, ``col``, and ``hue``. The first two have an obvious correspondence with the resulting array of axes; think of the hue variable as a third dimension along a depth axis, where different levels are plotted with different colors.
The class is used by initializing a :class:`FacetGrid` object with a dataframe and the names of the variables that will form the row, column, or hue dimensions of the grid. These variables should be categorical or discrete; the data at each level of a variable is then used for one facet along that axis. For example, say we wanted to examine differences between lunch and dinner in the ``tips`` dataset.
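A minimal sketch of that initialization, assuming a recent seaborn release. The small ``tips`` frame built here is a synthetic, hypothetical stand-in for the real dataset (which ``sns.load_dataset("tips")`` would download), so the example runs offline:

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the ``tips`` dataset; the values are made up.
rng = np.random.default_rng(0)
n = 80
tips = pd.DataFrame({
    "total_bill": rng.uniform(5, 50, n),
    "time": rng.choice(["Lunch", "Dinner"], n),
})
tips["tip"] = tips["total_bill"] * rng.uniform(0.1, 0.25, n)

# One column of axes per level of ``time``; nothing is drawn yet.
g = sns.FacetGrid(tips, col="time")
```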
Initializing the grid like this sets up the matplotlib figure and axes, but doesn't draw anything on them.
The main approach for visualizing data on this grid is with the :meth:`FacetGrid.map` method. Provide it with a plotting function and the name(s) of variable(s) in the dataframe to plot. Let's look at the distribution of tips in each of these subsets, using a histogram.
This function will draw the figure and annotate the axes, hopefully producing a finished plot in one step. To make a relational plot, just pass multiple variable names. You can also provide keyword arguments, which will be passed to the plotting function:
There are several options for controlling the look of the grid that can be passed to the class constructor.
Note that ``margin_titles`` isn't formally supported by the matplotlib API, and may not work well in all cases. (Please open an issue when it doesn't, though, to help it improve.)
The size of the figure is set by providing the height of the facets and the aspect ratio:
By default, the facets are plotted in the sorted order of the unique values for each variable, but you can specify an order:
Any seaborn color palette (i.e., something that can be passed to :func:`color_palette()`) can be provided. You can also use a dictionary that maps the names of values in the ``hue`` variable to valid matplotlib colors:
If you have many levels of one variable, you can plot it along the columns but "wrap" them so that they span multiple rows. When doing this, you cannot use a ``row`` variable.
Once you've drawn a plot using :meth:`FacetGrid.map` (which can be called multiple times), you may want to adjust some aspects of the plot. You can do this by directly calling methods on the matplotlib ``Figure`` and ``Axes`` objects, which are stored as member attributes at ``fig`` and ``axes`` (a two-dimensional array), respectively.
There are also a number of methods on the :class:`FacetGrid` object for manipulating the figure at a higher level of abstraction. The most general is :meth:`FacetGrid.set`, and there are other more specialized methods like :meth:`FacetGrid.set_axis_labels`. For example:
Both the :func:`lmplot` and :func:`factorplot` functions use :class:`FacetGrid` internally and return the object they have plotted on for additional tweaking.
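A sketch with :func:`lmplot` only, on synthetic stand-in data. Confidence intervals are switched off with ``ci=None`` simply to keep the example fast:

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the ``tips`` dataset; the values are made up.
rng = np.random.default_rng(0)
n = 80
tips = pd.DataFrame({
    "total_bill": rng.uniform(5, 50, n),
    "time": rng.choice(["Lunch", "Dinner"], n),
})
tips["tip"] = tips["total_bill"] * rng.uniform(0.1, 0.25, n)

# The return value is the FacetGrid that lmplot drew on.
g = sns.lmplot(data=tips, x="total_bill", y="tip", col="time", ci=None)
# Any FacetGrid method can be used for further tweaking.
g.set_axis_labels("Total bill", "Tip")
```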
.. _joint_grid:
Plotting bivariate data with :class:`JointGrid`
-----------------------------------------------
The :class:`JointGrid` class can be used when you want to plot the relationship between, or joint distribution of, two variables along with the marginal distribution of each variable.
Like :class:`FacetGrid`, initializing the object sets up the axes but does not plot anything:
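A sketch of the initialization, passing everything by keyword and using synthetic stand-in data. The grid exposes three axes: ``ax_joint``, ``ax_marg_x``, and ``ax_marg_y``:

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the ``tips`` dataset; the values are made up.
rng = np.random.default_rng(0)
n = 80
tips = pd.DataFrame({"total_bill": rng.uniform(5, 50, n)})
tips["tip"] = tips["total_bill"] * rng.uniform(0.1, 0.25, n)

# Sets up the joint and marginal axes without drawing any data.
g = sns.JointGrid(data=tips, x="total_bill", y="tip")
```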
The easiest way to use :class:`JointGrid` is to call :meth:`JointGrid.plot` with three arguments: a function to draw a bivariate plot, a function to draw a univariate plot, and a function to calculate a statistic that summarizes the relationship.
For more flexibility, you can use the separate methods :meth:`JointGrid.plot_joint`, :meth:`JointGrid.plot_marginals`, and :meth:`JointGrid.annotate`:
To control the presentation of the grid, use the ``size`` and ``ratio`` arguments. These control the size of the full figure (which is always square) and the ratio of the joint axes height to the marginal axes height:
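A sketch assuming seaborn 0.9 or later, where the figure size is passed as ``height`` (the argument was called ``size`` in older releases). A large ``ratio`` makes the marginal axes very thin; the data is a synthetic stand-in for ``tips``:

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the ``tips`` dataset; the values are made up.
rng = np.random.default_rng(0)
n = 80
tips = pd.DataFrame({"total_bill": rng.uniform(5, 50, n)})
tips["tip"] = tips["total_bill"] * rng.uniform(0.1, 0.25, n)

# An 8 x 8 inch figure whose joint axes are 50 times taller than the marginals.
g = sns.JointGrid(data=tips, x="total_bill", y="tip", height=8, ratio=50)
g.plot_joint(sns.scatterplot)
```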
The ``space`` keyword argument controls the amount of padding between the axes with the joint plot and the two marginal axes:
The :func:`jointplot` function can draw a nice-looking plot with a single line of code:
It can draw several different kinds of plots, with good defaults chosen for each:
In many cases, :func:`jointplot` should be sufficient for exploratory graphics, but it may be easier to use :class:`JointGrid` directly when you need more flexibility than is offered by the canned styles of :func:`jointplot`.